Abstract

Many models of epidemic spread have a common qualitative structure. The
numbers of infected individuals during the initial stages of an epidemic can be well
approximated by a branching process, after which the proportion of individuals
that are susceptible follows a more or less deterministic course. In this paper, we
show that both of these features are consequences of assuming a locally branching
structure in the models, and that the deterministic course can itself be determined
from the distribution of the limiting random variable associated with the backward,
susceptibility branching process. Examples considered include a stochastic version
of the Kermack & McKendrick model, the Reed-Frost model, and the Volz
configuration model.
1 Introduction

Kermack & McKendrick's (1927) model of the course of an epidemic in a closed population
has proved to be both effective in practice (see for example Brauer (2005), Brauer &
Castillo-Chavez (2012) p. 350, Gupta et al. (2011)) and influential in the theoretical
development of epidemic modelling. Writing $s(t)$ to denote the density of susceptible
individuals in the population at time $t$ and $\beta(v)$ the infectivity of an individual at time $v$
after becoming infected, and normalizing the initial population density to be $s(-\infty) = 1$,
the development of $s$ is given by the equation

$$ (-Ds(t)) = s(t) \int_0^\infty \beta(v)\,(-Ds(t-v))\,dv. \tag{1.1} $$

Here, $Ds$ denotes the derivative of $s$ with respect to time, and is negative. The quantity
$(-Ds(t))$ is the rate at which the density of susceptibles is being reduced at time $t$, and
this is just the (density standardized) rate at which infections are being made, explaining
the integral on the right hand side of (1.1) as the force of infection at time $t$. Dividing
both sides of (1.1) by $s(t)$ and integrating gives

$$ -\log s(t) = \int_0^\infty \beta(v)\{1 - s(t-v)\}\,dv. \tag{1.2} $$

Note that, if $s$ satisfies (1.2), so does any translate $s_h$ defined by $s_h(t) = s(t+h)$, for
any $h \in \mathbb{R}$. However, it is shown in Diekmann (1977) that there is exactly one solution $s$
to (1.2) that is non-increasing and non-negative, if, for instance, the value of $s(0) \in (0,1)$
is specified. Letting $t \to \infty$, (1.2) gives the final size equation

$$ -\log s(\infty) = R_0(1 - s(\infty)), \tag{1.3} $$

where the basic reproduction number $R_0 := \int_0^\infty \beta(v)\,dv$ is the expected total number
of infections made by an infected individual in a susceptible population of unit density;
a proportion $1 - s(\infty)$ of the population has been infected by the end of the epidemic.
Kermack & McKendrick (1927) then deduced their famous threshold theorem, that $s(\infty) < 1$
is only possible if $R_0 > 1$.

The final size equation can be interpreted more directly, without integrating (1.1), but
at the level of an individual. Rewrite (1.3) in the form

$$ s(\infty) = e^{-R_0(1 - s(\infty))}, \tag{1.4} $$

and recognize $R_0(1 - s(\infty))$ as the total integrated force of infection over the whole course
of the epidemic. The tacit assumption about force of infection at the level of the individual
is that it represents the 'instantaneous rate' of infection of an individual, interpreted in a
Markovian sense, so that the probability of an individual avoiding infection after exposure
to an integrated force of infection $f$ should be given by $e^{-f}$. Thus the right hand side
of (1.4) is the probability that an individual avoids infection throughout the whole course
of the epidemic, which is exactly the proportion $s(\infty)$ that remain uninfected to the end.
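
As a numerical illustration of the final size equation (1.4), the following minimal sketch solves $s = e^{-R_0(1-s)}$ by bisection. The function name and the bracketing interval $[0, 1/R_0]$ are choices made here for illustration and are not taken from the paper; the bracket is valid because $e^{-(R_0-1)} < 1/R_0$ whenever $R_0 > 1$.

```python
import math


def final_susceptible_fraction(r0: float, tol: float = 1e-12) -> float:
    """Solve s = exp(-r0 * (1 - s)) for the root in (0, 1) when r0 > 1."""
    if r0 <= 1.0:
        # Threshold theorem: below the threshold only s(inf) = 1 solves (1.4).
        return 1.0

    def h(s: float) -> float:
        return s - math.exp(-r0 * (1.0 - s))

    # h(0) < 0 and h(1 / r0) > 0 for r0 > 1, so the non-trivial root is bracketed.
    lo, hi = 0.0, 1.0 / r0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for r0 in (0.8, 1.5, 2.0, 3.0):
        s_inf = final_susceptible_fraction(r0)
        print(f"R0 = {r0}: s(inf) = {s_inf:.6f}, infected = {1 - s_inf:.6f}")
```

For $R_0 \le 1$ the sketch simply returns 1, in line with the threshold theorem quoted above.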

The equation (1.4), with $s(\infty)$ replaced by the symbol $q$, also has a classic
interpretation in a branching process context. It represents the equation for the extinction
probability $q$ of a branching process starting with a single individual, when the number
of offspring has the Poisson distribution $\mathrm{Po}(R_0)$ with mean $R_0$. At an individual level,
this suggests a stochastic analogue of the Kermack-McKendrick model, in which an
infected individual makes potentially infectious contacts according to a Poisson process of
rate $\beta(v)$, where $v$ represents the time since infection. Each such event leads to a new
infection, if the individual contacted is susceptible. In the early stages of an epidemic,
almost all individuals are still susceptible, and so the early development of the epidemic is
well approximated by a branching process, in which an individual at age $v$ has (Markovian)
birth rate $\beta(v)$. Branching processes have long been used to approximate the early stages
of epidemic processes in this way. The earliest papers are those of Kendall (1956) and
Whittle (1955), and a systematic treatment is given in Ball & Donnelly (1995). In
particular, the Kermack-McKendrick threshold theorem is replaced by a stochastic threshold
theorem, in which the probability that a large epidemic takes place, when started by
a single infected individual $K_0$ in an initially susceptible population of large size $N$, is
(approximately) $1 - q$, thus being positive exactly when the mean number of offspring,
here $R_0$, exceeds 1.

In contrast, for the analysis of the final size $N - S(\infty)$, where $S(t)$ denotes the number
of susceptibles at time $t$, the appropriate branching approximation is not at the beginning
of the epidemic, but approximates the process of the contacts potentially leading to the
infection of a randomly chosen individual $K$; see, for example, Diekmann &
Heesterbeek (2000), pp. 171-172. If this backward process of contacts contains few individuals,
as when its branching approximation dies out, then $K$ is unlikely to become infected,
whereas, if it contains many individuals, as when the branching approximation never dies
out, $K$ is almost certain to become infected, if the epidemic is a large one. Thus the
probability that a randomly chosen $K$ does not become infected is approximately 1 if the
epidemic starting from $K_0$ is a small one, and approximately the extinction probability $q_b$
for the 'backward' branching process, if the epidemic is a large one. However, because
of the random choice of $K$, the probability that $K$ escapes infection is just $N^{-1}\mathbb{E}S(\infty)$.
Hence

$$ 1 - N^{-1}\mathbb{E}S(\infty) \approx 1 - \{q + (1-q)q_b\} = (1-q)(1-q_b), $$

so that, given that the epidemic starting from $K_0$ is a large one, the (mean of the) final
proportion of infected individuals is close to $(1 - q_b)$. As it happens, for the stochastic
Kermack-McKendrick model described above, the forward and backward branching
processes are the same, so that $q_b = q$, and (1.4) is still the relevant equation for determining
the final outcome of the epidemic, with $s(\infty)$ replaced by $q_b$. Thus, in the deterministic
model, a large epidemic is certain, and the proportion of the population that is infected
is $(1 - q_b)$. In the stochastic model, a large epidemic occurs only with probability
approximately $(1 - q)$, in which case a proportion of approximately $(1 - q_b)$ of the individuals
are infected, and on the complementary event there is only a tiny outbreak involving
a negligible proportion of infected individuals. However, if the epidemic were started
with $J > 1$ individuals, the probability of a large outbreak, again leading to a proportion
of approximately $(1 - q_b)$ of the individuals being infected, increases to $(1 - q^J)$, and is
thus nearly a certain event if $J$ is at all large.
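
The quantities $q$, $1 - q^J$ and $(1-q)(1-q_b)$ in this discussion are easy to evaluate when the offspring distribution is $\mathrm{Po}(R_0)$. The sketch below iterates the offspring probability generating function $f(x) = e^{R_0(x-1)}$ from 0, which converges to the smallest fixed point; the function names are hypothetical, and the last function assumes $q_b = q$, as in the stochastic Kermack-McKendrick model.

```python
import math


def extinction_probability(r0: float, iterations: int = 500) -> float:
    """Smallest root q of q = exp(r0 * (q - 1)), via q_{n+1} = f(q_n), q_0 = 0."""
    q = 0.0
    for _ in range(iterations):
        q = math.exp(r0 * (q - 1.0))
    return q


def large_outbreak_probability(r0: float, initial_infectives: int) -> float:
    """Approximate probability 1 - q**J of a large epidemic started by J infectives."""
    return 1.0 - extinction_probability(r0) ** initial_infectives


def mean_final_infected_fraction(r0: float) -> float:
    """(1 - q)(1 - q_b) with q_b = q, for an epidemic started by one infective."""
    q = extinction_probability(r0)
    return (1.0 - q) * (1.0 - q)


if __name__ == "__main__":
    for j in (1, 2, 5, 10):
        print(j, large_outbreak_probability(2.0, j))
```

The fixed number of iterations is adequate away from $R_0 = 1$; close to criticality the iteration converges slowly and a root finder would be preferable.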

In this paper, we use analogous ideas to show that, under appropriate assumptions, the 
whole course of the stochastic epidemic is determined by the analysis of the two branching 
processes, forward and backward. There is an initial phase, approximated as usual by the 
forward branching process. If this branching process does not become extinct, it settles 
to an essentially deterministic course of exponential growth, after a random delay that 
results from the initial random development of the branching process. After the point 
at which the forward branching process ceases to be a good approximation, the propor- 
tion of susceptibles in the epidemic process follows an almost deterministic development, 
which can be expressed in terms of properties of the backward branching process. One 
of the consequences of this is to show that the Markovian stochastic interpretation of the 
instantaneous force of infection, which is implicit in the derivation of the deterministic 
Kermack-McKendrick equation (1.1), is not actually necessary to justify the equation; we
prove that (1.1) holds as a faithful approximation in a much wider class of models.

We illustrate the approach for the Reed-Frost discrete generation epidemic model in
a population of size $N$. Let the probability of an infected individual infecting a given
susceptible be $p = \mu/N$. Then the approximating Galton-Watson forward branching
process has offspring distribution $\mathrm{Po}(\mu)$ (and $R_0 = \mu$); we take $\mu > 1$. After $n$ time
units, the number of individuals alive in the branching process is $Z_n \sim W\mu^n$ and the total
number of individuals that were alive in previous generations is approximately
$W\mu^n/(\mu - 1)$, where $W$ is the a.s. limit of $Z_n\mu^{-n}$. Take

$$ n = n(N) := \lfloor \tfrac12 \log N/\log\mu \rfloor, $$

so that $\mu^n = \theta_N N^{1/2}$, where $\mu^{-1} < \theta_N \le 1$, and suppose that $Z_n > 0$. Label those
that have died in chronological order, with labels drawn independently and at random
from $[N] := \{1, 2, \ldots, N\}$. Mark any whose labels have been used before, and all of their
descendants, as 'ghosts'. There are only few marked, and those that are unmarked are the
individuals that have been infected before time $n$ in the epidemic. Let the set of labels
used be denoted by $L_N$; its size is small compared to $N$.

Now, starting from a randomly chosen individual, take an independent realization
of the reversed branching process (in this model, it has the same law as the forward
process) and run it for $n(N) + r$ generations, after which there have been approximately
$\widehat W\mu^{n+r+1}/(\mu - 1)$ individuals born in total, where $\widehat W$ is the corresponding realization of the
limit random variable, and is independent of $W$. Label these individuals in chronological
order at random from $[N] \setminus L_N$, and again mark the (few) ghosts; let the set of labels
be $L_N^b$, and denote by $K$ the label of the initial individual. Do the same for the individuals
alive in generation $n$ of the forward process, and call this set $L_N^f$. If $L_N^f \cap L_N^b \ne \emptyset$, and
an element of the intersection is a non-ghost, we can construct a chain of infection to it
from the initial individual in the epidemic, and a chain going from it to $K$, giving a chain
of infection from the start of the epidemic to $K$. Conversely, any chain of infection from
the start of the epidemic to $K$ must pass through a non-ghost element of $L_N^f \cap L_N^b$. Thus
there is no chain of infection from the start of the epidemic to $K$ exactly when $L_N^f \cap L_N^b$
is empty or contains only ghosts; the event that $L_N^f \cap L_N^b$ is non-empty but contains only
ghosts has only small probability.

Now, given $Z_n$ and the realization of the backward branching process, the mean
number of intersections between $L_N^f$ and $L_N^b$ is close to $N^{-1}Z_n\widehat W\mu^{n+r+1}/(\mu - 1)$, and hence,
using a Poisson approximation, the probability of the intersection being empty is close to

$$ \exp\{-N^{-1}Z_n\widehat W\mu^{n+r+1}/(\mu-1)\} = \exp\{-N^{-1/2}Z_n\theta_N\widehat W\mu^{r+1}/(\mu-1)\}. $$

It is now easy to convert this result into the statement

$$ \mathbb{P}[K' \text{ has escaped infection until generation } 2n+r \mid \mathcal{F}_n]
   \sim \mathbb{E}\bigl\{\exp\{-N^{-1/2}Z_n\theta_N\widehat W\mu^{r+1}/(\mu-1)\} \bigm| \mathcal{F}_n\bigr\}, $$

where $K'$ is a randomly chosen label from all of $[N]$ and $\mathcal{F}_n$ denotes $\sigma(Z_l, 0 \le l \le n)$; in
other words, still with $n = n(N)$,

$$ \mathbb{E}\{N^{-1}S_N(2n+r) \mid \mathcal{F}_n\} \sim \psi(N^{-1/2}Z_n\theta_N\mu^{r+1}/(\mu-1)), $$

where $S_N(t)$ denotes the number of susceptibles in the epidemic at generation $t$ and
$\psi(x) := \mathbb{E}\,e^{-x\widehat W}$ is the Laplace transform of $\widehat W$.

But now, for two independently randomly chosen individuals $K$ and $K'$,

$$ \mathbb{E}\{(N^{-1}S_N(2n+r))^2 \mid \mathcal{F}_n\}
   = \mathbb{P}[\text{both } K \text{ and } K' \text{ have escaped infection until generation } 2n+r \mid \mathcal{F}_n] $$

can be approximated in exactly the same way; since there is little overlap between the
labels assigned to the backward branching processes starting from $K$ and $K'$, it is easy
to deduce that

$$ \mathbb{E}\{(N^{-1}S_N(2n+r))^2 \mid \mathcal{F}_n\} \sim \{\psi(N^{-1/2}Z_n\theta_N\mu^{r+1}/(\mu-1))\}^2 $$

also, implying that $\mathrm{Var}\{N^{-1}S_N(2n+r) \mid \mathcal{F}_n\} \sim 0$. Writing $W = \lim_{m\to\infty} Z_m\mu^{-m}$, we
note that $Z_n = Z_{n(N)} \sim W\mu^{n(N)} = \theta_N N^{1/2}W$; this implies that, for any $\varepsilon > 0$ and any
$r \in \mathbb{Z}$,

$$ \lim_{N\to\infty} \mathbb{P}\bigl[|N^{-1}S_N(2n(N)+r) - \psi(W\theta_N^2\mu^{r+1}/(\mu-1))| > \varepsilon\bigr] = 0. \tag{1.5} $$

The quantity $\psi(W\theta_N^2\mu^{r+1}/(\mu - 1))$ is random only through the presence of $W$. By
time $n(N)$ the quantity $W$ is essentially determined, and is the same for all $r \in \mathbb{Z}$.
If $W = 0$, the above approximation is by $\psi(0) = 1$ for all $r$, indicating that only a small
epidemic occurs; the assumption $\mu > 1$ merely ensures that $\mathbb{P}[W > 0] > 0$, so that a large
epidemic is indeed possible.

If $W > 0$, one could describe the approximation slightly differently. The values of
$N^{-1}S_N(2n(N) + r)$ for $r \in \mathbb{Z}$ are then approximated by a discrete subset of points on
the continuous deterministic curve $u \mapsto \psi(\mu^{u+1}/(\mu - 1))$, namely those with $u$ of the
form $r + \{\log W + 2\log\theta_N\}/\log\mu$ for $r \in \mathbb{Z}$. Thus randomness appears only as a time
shift in the lattice of integer spaced points along the continuous deterministic path that
are used for the approximation to the discrete time process. Note also that the times $l$
at which $N^{-1}S_N(l)$ is not close either to 0 or to 1 are within $O(1)$ of $\log N/\log\mu$; the
development of the epidemic is slow until almost time $\log N/\log\mu$, and then runs its
course over comparatively few time steps.
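
The deterministic curve $u \mapsto \psi(\mu^{u+1}/(\mu-1))$ can be tabulated numerically. The sketch below is an illustration only, not the authors' procedure: it relies on the standard identity $\mathbb{E}\exp(-\theta Z_k\mu^{-k}) = f_k(e^{-\theta\mu^{-k}})$, where $f_k$ is the $k$-fold composition of the offspring generating function $f(x) = e^{\mu(x-1)}$, and takes $k$ large enough that $\theta\mu^{-k}$ is below a small threshold. The function names and the threshold `start` are assumptions made here.

```python
import math


def laplace_transform_W(theta: float, mu: float, start: float = 1e-9) -> float:
    """Approximate psi(theta) = E exp(-theta * W) for Po(mu) offspring, mu > 1.

    The recursion is run on x = 1 - (value of the p.g.f.) to avoid
    cancellation near 1: if x = 1 - y then 1 - f(y) = 1 - exp(-mu * x).
    """
    if theta <= 0.0:
        return 1.0
    depth, t = 0, theta
    while t > start:            # choose k with theta / mu**k <= start
        t /= mu
        depth += 1
    x = -math.expm1(-t)         # 1 - exp(-theta / mu**k)
    for _ in range(depth):
        x = -math.expm1(-mu * x)
    return 1.0 - x


def susceptible_curve(u: float, mu: float) -> float:
    """The curve u -> psi(mu**(u + 1) / (mu - 1)) for the Reed-Frost example."""
    return laplace_transform_W(mu ** (u + 1.0) / (mu - 1.0), mu)


if __name__ == "__main__":
    mu = 2.0
    for u in range(-8, 9, 2):
        print(f"u = {u:+d}: susceptible fraction ~ {susceptible_curve(u, mu):.6f}")
```

The tabulated values decrease from 1 towards the extinction probability of the backward process as $u$ increases, which is the sigmoidal shape described in the text.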

In what follows, we shall make these arguments precise, but for processes with
non-lattice offspring distributions in continuous time. The phenomena associated with
discretization disappear, giving a neater result, but connecting the forward and backward
branching processes becomes more delicate. Our analogue of (1.5) is proved in
Theorem 2.8, under some fairly mild assumptions on the individual point processes of
infection that include the stochastic Kermack-McKendrick model described above for many
choices of the infectivity function $\beta$. It establishes that

$$ \lim_{N\to\infty} \mathbb{P}\Bigl[\sup_u \bigl|N^{-1}S_N(\lambda^{-1}\{\log N - \log W + u\}) - s(u)\bigr| > \varepsilon\Bigr] = 0, \tag{1.6} $$

for a deterministic function $s$, whenever $W > 0$; here, $\lambda$ is the Malthusian parameter
(assumed positive) and $W$ the limiting random variable for the associated forward branching
process, and $s$ is determined by the properties of the associated backward branching
process. The methods that we use have quite general application, and have already been
exploited in Barbour & Reinert (2012) in the context of the Aldous (2010) gossip process
and of the Moore & Newman (1999) small world model.
The key ingredients that make the proofs go through are the branching nature of the
forward and backward processes, and their exponential growth and stability properties.
These are also shared, for instance, by their multitype analogues. We give a multitype
analogue of (1.6) in Section 3, and discuss a configuration model in Section 3.2.

2 The single type model 
2.1 The branching processes 

We begin by considering an epidemic in a closed population of $N$ individuals, where $N$ is to
be thought of as large, that evolves according to the following scheme. Each individual $i$,
$1 \le i \le N$, is equipped with a potential infection history, in the form of a realization $\xi_i$ of
a point process on $(0, \infty)$. If $i$ becomes infected at time $\sigma(i) < \infty$, it makes infectious
contacts with other individuals at times $\sigma(i,j) := \sigma(i) + \tau(i,j)$, where $0 < \tau(i,1) \le
\tau(i,2) \le \cdots$ denote the times of the events of $\xi_i$ and $\nu(i) := \xi_i(0,\infty) < \infty$ their number;
if required, $\xi_i$ can be augmented by a time $\tau^r(i) \ge \tau(i, \nu(i))$, indicating that $i$ is removed
from the infectious state at time $\sigma(i) + \tau^r(i)$. The individuals contacted are chosen
independently at random from $[N]$, and an infectious contact only results in the individual
contacted becoming infected if they have not previously been contacted. The epidemic
begins with individual $i_1$ becoming infected at time $\sigma(i_1) = 0$. After the $r$-th individual $i_r$
has become infected at time $\sigma(i_r)$, and if $r < N$, then potential infectious contacts occur
at the times $\sigma(i_r) + v_r(j)$, $1 \le j \le |V_r|$, where the $v_r(j)$ are the elements of

$$ V_r := \{\sigma(i_l, j) - \sigma(i_r),\ 1 \le j \le \nu(i_l),\ 1 \le l \le r\} \cap (0, \infty), $$

arranged in non-decreasing order, and the labels of the individuals to be contacted are
given by $I_r(j)$, $j \ge 1$, chosen independently and uniformly on $[N]$. Defining the index
$j^*(r) := \min\{1 \le j \le |V_r| : I_r(j) \notin \{i_1, i_2, \ldots, i_r\}\}$, then

$$ i_{r+1} := I_r(j^*(r)) \quad\text{and}\quad \sigma(i_{r+1}) = \sigma(i_r) + v_r(j^*(r)), $$

unless there is no such index $j^*(r)$, in which case the epidemic stops. It is assumed that
$(\xi_i, 1 \le i \le N)$ are independent and identically distributed.
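
The construction above can be sketched as a small event-driven simulation. The code is an illustration under stated assumptions rather than the paper's construction verbatim: the infection history $\xi_i$ is taken, hypothetically, to be a Poisson process of constant rate on $(0, \text{duration}]$, labels run over $0, \ldots, N-1$ instead of $[N]$, and each history is sampled at the moment of infection, which has the same law as assigning i.i.d. histories in advance.

```python
import heapq
import random


def sample_history(rng: random.Random, rate: float, duration: float) -> list:
    """Hypothetical history xi_i: Poisson process of the given rate on (0, duration]."""
    times = []
    t = rng.expovariate(rate)
    while t <= duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def simulate_epidemic(n: int, rate: float, duration: float, seed: int = 0) -> dict:
    """Return {label: infection time sigma(label)} in order of infection."""
    rng = random.Random(seed)
    first = rng.randrange(n)
    infection_time = {first: 0.0}
    # Heap of pending potential infectious contact times sigma(i) + tau(i, j).
    pending = sample_history(rng, rate, duration)
    heapq.heapify(pending)
    while pending and len(infection_time) < n:
        t = heapq.heappop(pending)
        target = rng.randrange(n)          # contacted label, uniform over all labels
        if target in infection_time:
            continue                       # contact wasted: target already infected
        infection_time[target] = t
        for tau in sample_history(rng, rate, duration):
            heapq.heappush(pending, t + tau)
    return infection_time


if __name__ == "__main__":
    result = simulate_epidemic(n=2000, rate=2.0, duration=1.0, seed=1)
    print("final number infected:", len(result))
```

With these illustrative choices the mean number of contacts per infective is `rate * duration`, which plays the role of $\mu$.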

If the labelling were ignored, and $j^*(r)$ were taken to be 1 for each $r \ge 1$, and if the $r$-th
infected individual were assigned infection history $\xi'_r$, with the $(\xi'_r, r \ge 1)$ independent
and identically distributed, then the resulting process would be a Crump-Mode-Jagers
branching process $Z$. Indeed, if the $\xi'_r$ are distributed in the same way as $\xi_1$, the paths of
the branching and epidemic processes (neglecting the labelling) can be coupled so as to
agree exactly until $\rho := \min\{r > 0 : j^*(r) \ge 2\}$ (Ball 1983, Ball & Donnelly, 1995), with
the epidemic process recoverable from the branching process by adding labelling, and by
marking as 'ghosts' individuals infected in the branching process but not in the epidemic
process ($(j^*(r) - 1)$ such infections occur whenever $j^*(r) \ge 2$), together with the
individuals in the branching process that are descended from such individuals. We shall
make substantial use of this coupling, but only up to times where there have typically
been relatively few ghosts created.
We shall make the following assumptions on the distribution of $\xi_1$ of the above
Crump-Mode-Jagers branching process. Let $p_j := \mathbb{P}[\nu(1) = j]$ and $\mu := \mathbb{E}\nu(1)$; denote the relative
intensity measure of $\xi_1$ by

$$ G(dt) := \mu^{-1}\mathbb{E}\xi_1(dt). \tag{2.1} $$

Assumptions

1. We assume that the branching process is supercritical, and that

$$ 1 < \mu < \infty; \qquad m_2 := \mathbb{E}\nu(1)^2 < \infty. $$

Let $\lambda > 0$ denote the Malthusian parameter of the branching process, satisfying

$$ \mu\int_0^\infty e^{-\lambda t}\,G(dt) = 1. \tag{2.2} $$

The existence of $\lambda > 0$ follows from Jagers (1975), Theorem 6.3.3, pp. 131-2. We
write

$$ m^* := \mu\lambda\int_0^\infty t e^{-\lambda t}\,G(dt) < \infty; \tag{2.3} $$

then $m^*/\lambda$ represents the mean age at child bearing (Jagers (1989), p. 195).

2. The intensity measure $G$ is non-lattice and has finite second moment. The support
of $G$ is a finite or semi-infinite open interval $(a, b)$, and $G(A) \ge \int_A g(x)\,dx$ for any
$A \subset (a, b)$, for some continuous positive density $g$. If $b = \infty$, then also $g(x) \ge kx^{-\gamma}$
for all $x \ge x_0$, for some $x_0 > a$, $k > 0$ and $\gamma > 3$.
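
Equation (2.2) rarely has a closed-form solution, but $\lambda$ is easy to compute numerically once the Laplace transform of $G$ is available. The following sketch brackets the root and bisects; the function name and its arguments are hypothetical, and the caller is assumed to supply the map $\lambda \mapsto \int_0^\infty e^{-\lambda t}G(dt)$.

```python
def malthusian_parameter(mu, laplace_G, upper=1.0, tol=1e-12):
    """Solve mu * laplace_G(lam) = 1 for lam > 0 by bracketing and bisection.

    laplace_G(lam) must return the integral of exp(-lam * t) against G(dt),
    which is decreasing in lam and equals 1 at lam = 0.
    """
    if mu <= 1.0:
        raise ValueError("the supercritical case mu > 1 is required")
    while mu * laplace_G(upper) > 1.0:     # expand until the root is bracketed
        upper *= 2.0
    lower = 0.0
    while upper - lower > tol * max(1.0, upper):
        mid = 0.5 * (lower + upper)
        if mu * laplace_G(mid) > 1.0:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)


if __name__ == "__main__":
    # Exponential G with rate gamma: transform gamma / (gamma + lam),
    # so (2.2) gives lam = gamma * (mu - 1).
    gamma, mu = 1.5, 2.5
    print(malthusian_parameter(mu, lambda lam: gamma / (gamma + lam)))
```

For an exponential $G$ the answer can be checked by hand, which makes it a convenient test case before applying the routine to less tractable infectivity profiles.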

Remark. Strictly speaking, the epidemic might be better modelled by assuming that the
labels assigned to the individuals infected by any given individual $i$ are chosen at random
without replacement from the labels excluding $i$, and indeed that the number infected by a
single individual cannot exceed $N - 1$. However, under the assumption that $m_2 < \infty$, the
total variation distance between this distribution of labels and that being assumed here
is at most $\tfrac12 N^{-1}(m_2 + \mu)$. Since we need only to consider the offspring of at most $N^{5/8}$
individuals in our calculations, any difference between the results of the two models occurs
with probability of order at most $O(N^{-3/8})$, and does not affect the results proved in this
paper.

Letting the infection times in the branching process be denoted by $(\sigma'(r), r \ge 1)$, and
writing

$$ B'(t) := \max\{r : \sigma'(r) \le t\} \tag{2.4} $$

for the number of births that have occurred in the branching process by time $t$, it
follows that $W(t) := B'(t)e^{-\lambda t} \to W$ a.s. for a non-negative random variable $W$
(Nerman (1981), Theorem 5.4), and also that $\{W > 0\} = \{\lim_{t\to\infty} B'(t) = \infty\}$ a.s. (see (3.10)
in Nerman (1981)). From Corollary 5.6 in Nerman (1981), and the fact that pointwise
convergence to a continuous limit of non-decreasing bounded functions on $[0, \infty]$ is always
uniform (Jagers (1975), p. 170), it also follows that the statistics of the set

$$ V'(t) := \{\sigma'(l) + \tau'(l,j) - t,\ 1 \le j \le \nu'(l),\ 1 \le l \le B'(t)\} \cap (0,\infty), $$

where $\tau'(l,j)$ denotes the $j$-th point of $\xi'_l$ and $\nu'(l) := \xi'_l(0,\infty)$, converge in distribution, as
$t \to \infty$, in the sense that, on $\{W > 0\}$,

$$ \lim_{t\to\infty}\sup_{s>0}\Bigl|\frac{|V'(t) \cap (0,s]|}{|V'(t)|} - F(s)\Bigr| = 0 \quad\text{a.s.} \tag{2.5} $$

Here $F$ is the distribution function on $\mathbb{R}_+$ given by

$$ 1 - F(s) := \frac{\mu}{\mu-1}\int_s^\infty (1 - e^{-\lambda(u-s)})\,G(du). \tag{2.6} $$

For the epidemic, the corresponding quantities depend on the choice of $N$, because of
the role played by the labelling in its definition. We define

$$ B_N(t) := \max\{r : \sigma(i_r) \le t\} $$

and, in the natural notation,

$$ V_N(t) := \{\sigma(i_l) + \tau(i_l,j) - t,\ 1 \le j \le \nu(i_l),\ 1 \le l \le B_N(t)\} \cap (0,\infty). $$

Provided that $t$ is not too large, $B_N(t)$ is not very much smaller than $B'(t)$, and
$|V'(t) \setminus V_N(t)|$ is also relatively small. This is the case if we take

$$ t = t_N(u) := \lambda^{-1}(\tfrac12\log N + u), \tag{2.7} $$

for any fixed $u > 0$, since then $B'(t_N(u)) \sim We^u\sqrt N$, and hence the number of indices
of $[N]$ chosen more than once in the construction of the epidemic up to this time has
mean

$$ N^{-1}\binom{B'(t_N(u))}{2} \sim \tfrac12 W^2 e^{2u}, $$

of relative order $O(N^{-1/2})$ when compared to $B'(t_N(u))$ as $N$ becomes large; this
observation is made precise later.

We now suppose that $W > 0$, and that the branching and epidemic processes have
been coupled as described above up to the time $\tau_N := \tau(B', \lfloor\sqrt N\rfloor)$, where
$\tau(B', r) := \inf\{t > 0 : B'(t) \ge r\}$ for any $r > 0$. We denote by $\mathcal{F}_{\tau_N}$ the corresponding σ-field,
including the information in the sets $V'(\tau_N)$ and $V_N(\tau_N)$, but not that of the labels that
are to be assigned to them for the epidemic process. Since $B'(t)e^{-\lambda t} \to W$ a.s. as $t \to \infty$,
it follows that $B'(t-)/B'(t) \to 1$ a.s. also, and hence that $\lim_{N\to\infty} N^{-1/2}B'(\tau_N) = 1$ a.s.
Thus

$$ \tau_N = \lambda^{-1}\{\log B'(\tau_N) - \log W(\tau_N)\} \sim \lambda^{-1}\{\tfrac12\log N - \log W\} $$

as $N \to \infty$. Note that $B'(\tau_N) = \lfloor\sqrt N\rfloor$ if $G$ is absolutely continuous with respect to
Lebesgue measure.

We now examine whether, and if so when, a randomly chosen individual $K \in [N]$
becomes infected. To do so, we begin by writing

$$ J_N := [N] \setminus \{i_r,\ 1 \le r \le B'(\tau_N)\} \tag{2.8} $$

to denote the set of indices that have not been used in the definition of the epidemic up
to time $\tau_N$, and we set

$$ J_{Nl} := \{j \in J_N : \nu(j) = l\}, \quad M_{Nl} := |J_{Nl}| \quad\text{and}\quad
   M_N := \sum_{j\in J_N}\nu(j) = \sum_{l\ge1} l M_{Nl}. \tag{2.9} $$

We then let

$$ G_{Nlk}(x) := \frac{1}{M_{Nl}}\sum_{j\in J_{Nl}} I[\tau(j,k) \le x] \tag{2.10} $$

denote the empirical distribution function of the times of the $k$-th in order potential
infections of individuals that have $l$ such in total, and write

$$ G_N(x) := \frac{1}{M_N}\sum_{l\ge1} M_{Nl}\sum_{k=1}^{l} G_{Nlk}(x) = \frac{1}{M_N}\sum_{j\in J_N}\xi_j(0,x] \tag{2.11} $$

for the overall empirical distribution of the infection times of individuals in $J_N$. We
introduce the σ-field

$$ \mathcal{F}^*_N := \mathcal{F}_{\tau_N} \vee \sigma\bigl(\tau(j,k),\ 1 \le k \le \nu(j),\ j \in J_N\bigr). \tag{2.12} $$

If $K \in [N] \setminus J_N$, it has already been infected during the epidemic process before time $\tau_N$;
the conditional probability of this occurring is $\zeta_N := N^{-1}B'(\tau_N)$, and this is small. If not,
it can only have been infected if there is a chain of infection running backwards from $K$
to one of the $|V_N(\tau_N)|$ individuals in $J_N$ that were infected by individuals in $[N] \setminus J_N$, but
at times after $\tau_N$. Now the infection events originating from individuals in $J_N$ are
directed at independently and randomly chosen individuals in $[N]$. Hence, $K$ is potentially
directly infected as a result of a set of $\mathrm{Bi}(M_N, 1/N)$-many events; the individuals that
infect $K$ (its generation 1 predecessors) were themselves infected at times preceding the
infection of $K$ by amounts realized through a Bernoulli$(1/N)$ thinning of the set of $M_N$
times $\{\tau(j,k),\ 1 \le k \le \nu(j),\ j \in J_N\}$. This procedure can be iterated to determine
the predecessors in successive generations, with duplicate choices of a pair $(j,k)$ leading
to 'ghosts', as before. In this way, the susceptibility process, consisting of the chains of
potential infection leading to $K$, can be generated from a branching process $\widehat Z_N$ with
numbers of offspring having a binomial $\mathrm{Bi}(M_N, 1/N)$ distribution, and occurring at times
sampled independently from $G_N$.

For the purposes of asymptotics, it is inconvenient to have this branching process
dependent on $N$. With some associated error, it can be replaced with a branching process $\widehat Z$
that has a Poisson $\mathrm{Po}(\mu)$ offspring distribution, noting that

$$ \mu := \sum_{l\ge1} l p_l \approx N^{-1}M_N = \sum_{l\ge1} l\,\frac{M_{Nl}}{N}, \tag{2.13} $$

with the birth times independently sampled from the distribution $G$ defined in (2.1). Note
that we can write

$$ G = \frac1\mu\sum_{l\ge1} p_l\sum_{k=1}^{l} G_{lk}, \tag{2.14} $$

where $G_{lk}$ is the distribution function of the time of the $k$-th event in $\xi_1$, conditional
on $\nu(1) = l$. For this branching process, we can define $\widehat B(t)$ to be the number of births
up to time $t$, and conclude that, under our assumptions, by Theorem 5.4 and (3.10) of
Nerman (1981),

$$ \widehat B(t)e^{-\lambda t} \to \widehat W \quad\text{a.s.}, \tag{2.15} $$

for a random variable $\widehat W$ that satisfies $\{\widehat W > 0\} = \{\lim_{t\to\infty}\widehat B(t) = \infty\}$ a.s. Furthermore,
letting

$$ \widehat A(t) := \{a_t(r) : 1 \le r \le \widehat B(t)\}, $$

where $a_t(r) := t - \hat\sigma(r)$ is the age at time $t$ of the $r$-th individual, it also follows that,
on $\{\widehat W > 0\}$, by Corollary 5.6 in Nerman (1981) together with the observation from p. 170
of Jagers (1975),

$$ \lim_{t\to\infty}\sup_{s>0}\Bigl|\frac{1}{\widehat B(t)}\sum_{r=1}^{\widehat B(t)} I[a_t(r) \le s] - (1 - e^{-\lambda s})\Bigr| = 0 \quad\text{a.s.} \tag{2.16} $$



Note that, for any $\theta > 0$,

$$ \int_0^\infty e^{-\theta t}\mu\,G(dt) = \mathbb{E}\Bigl(\int_0^\infty e^{-\theta t}\,\xi_1(dt)\Bigr), $$

so that the branching processes $Z$ and $\widehat Z$ indeed have the same Malthusian parameter $\lambda$.
We consider this branching process run until time $t_N(u)$ as in (2.7), and we show in the
next section that it represents a good enough approximation to the process of chains of
potential infection to $K$.

Finally, we assign labels from $J_N$ independently and at random to the individuals in
the set $U_N$, whose birth times are the elements of $\tau_N + V_N(\tau_N)$ (these are the birth
times in the forward epidemic process that have been determined by time $\tau_N$, but have
not occurred by then), and also to the set $\widehat U_N(u)$ composed of the distinct individuals
among the $\widehat B(t_N(u))$ that are born before $t_N(u)$ in the reverse process. If the same label
is chosen for an individual in $U_N$, having birth time $\tau_N + v_1$, for some $v_1 \in V_N(\tau_N)$, and for
an individual in $\widehat U_N(u)$, with birth time $\hat\sigma(r) \le t_N(u)$, then there is a chain of infection
to $K$ of length close to

$$ \tau_N + v_1 + \hat\sigma(r) = \lambda^{-1}\{\log\lfloor\sqrt N\rfloor + \tfrac12\log N - \log W(\tau_N) + u\} + v_1 - a_{t_N(u)}(r) $$
$$ \sim \lambda^{-1}\{\log N - \log W + u\} + v_1 - a_{t_N(u)}(r); $$

the actual length is $\tau_N + v_1 + \hat\sigma_N(r)$, where $\hat\sigma_N(r)$ is the birth time in the $\widehat Z_N$ process. If, for
any such pair, $v_1 \le a_{t_N(u)}(r)$, so that the length of the chain of infection is no greater than
$\lambda^{-1}(\log\lfloor\sqrt N\rfloor + \tfrac12\log N - \log W(\tau_N) + u)$, and if the $r$-individual is not a ghost, then $K$
is infected before this time; that is, approximately, before time $\lambda^{-1}(\log N - \log W + u)$.

2.2 Approximating $\widehat Z_N$ by $\widehat Z$

The first step to be justified is that the branching process $\widehat Z_N$ with offspring numbers
distributed according to the binomial $\mathrm{Bi}(M_N, 1/N)$ distribution and with ages at birth
independently sampled from $G_N$, as in (2.9) and (2.11), can be replaced in our
considerations by the process $\widehat Z$, in which the offspring numbers have the Poisson $\mathrm{Po}(\mu)$
distribution and the ages are sampled independently from $G$, as in (2.13) and (2.1). We
begin by showing that the two constructions lead to the same offspring numbers, with
high probability conditional on $\mathcal{F}^*_N \cap A_N$, at least until the first $\lfloor N^{5/8}\rfloor$ sets of progeny
have been sampled; here, $A_N \in \mathcal{F}^*_N$ is a suitably chosen event, whose complement has
small probability.

Lemma 2.1 Let

$$ A_N := \{|N^{-1}M_N - (1-\zeta_N)\mu| \le N^{-7/16}\} \cap \{\zeta_N \le N^{-1/2}(\mu+1)\}; \tag{2.17} $$

then $\mathbb{P}[A_N^c] = O(N^{-1/8})$. On $A_N$, it is possible to construct realizations of $\widehat Z_N$ and $\widehat Z$ on
the same probability space, in such a way that the numbers of offspring in the first $\lfloor N^{5/8}\rfloor$
sets of progeny in the two processes are identical with conditional probability $1 - O(N^{-1/8})$.

Proof: We begin by noting from (2.9) that $M_N := \sum_{j\in J_N}\nu(j)$ is a sum of $N - B'(\tau_N)$
independent and identically distributed random variables with mean $\mu$ and finite variance.
Hence, by Chebyshev's inequality,

$$ \mathbb{P}[|N^{-1}M_N - (1-\zeta_N)\mu| > N^{-7/16}] \le N^{-1}\mathbb{E}\{\nu(1)^2\}N^{7/8} = O(N^{-1/8}). \tag{2.18} $$

Then observe that

$$ B'(\tau_N) \le B'(0) + \sum_{j=1}^{\lfloor\sqrt N\rfloor - 1} X_j, \tag{2.19} $$

where $X_j$ denotes the number of offspring of the $j$-th born individual (randomly ordered
in the case of simultaneous births). Hence, with $B'(0) = 1$ and since $\mu > 1$,

$$ \mathbb{P}[\zeta_N > N^{-1/2}(\mu+1)] \le \mathbb{P}[B'(\tau_N) > 1 + (\lfloor\sqrt N\rfloor - 1)\mu + \sqrt N] \le N^{-1/2}\,\mathrm{Var}\,\nu(1), $$

by Chebyshev's inequality.

Now the total variation distance between $\mathrm{Bi}(M_N, 1/N)$ and $\mathrm{Po}(M_N/N)$ is at most
$1/N$ (Barbour, Holst & Janson (1992), (1.23)), so that branching processes with these
two offspring distributions can be coupled so as to agree until after $\lfloor N^{5/8}\rfloor$ sets of progeny
have been sampled with failure probability of at most $N^{-3/8}$. Then, by considering the
likelihood ratio, $r$ independent samples from Poisson distributions with means $\mu$ and $\mu'$
can be distinguished with probability at most
$d_{TV}(\mathrm{Po}(r\mu), \mathrm{Po}(r\mu')) \le r|\mu - \mu'|/\sqrt{r\mu}$;
see for example Barbour, Holst & Janson (1992), Theorem 1.C. Hence, if
$|N^{-1}M_N - \mu| \le N^{-7/16} + \mu\zeta_N$ and $\zeta_N \le N^{-1/2}(\mu+1)$, $\lfloor N^{5/8}\rfloor$ samples from $\mathrm{Po}(\mu)$ and
from $\mathrm{Po}(M_N/N)$ can be coupled so as to be identical, except on an event of probability of
order $O(N^{-1/8})$. This proves the lemma. ■
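
The first step of the proof can be checked numerically. The sketch below computes the total variation distance between $\mathrm{Bi}(n, p)$ and $\mathrm{Po}(np)$ directly from the probability mass functions; the values $N = 1000$ and $M_N = 2000$ used in the example are hypothetical, chosen only to illustrate the bound $1/N$ and the union bound over $\lfloor N^{5/8}\rfloor$ sets of progeny.

```python
import math


def binomial_log_pmf(k: int, n: int, p: float) -> float:
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def poisson_log_pmf(k: int, lam: float) -> float:
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def tv_binomial_poisson(n: int, p: float) -> float:
    """Total variation distance between Bi(n, p) and Po(n * p), for 0 < p < 1."""
    lam = n * p
    total, poisson_mass = 0.0, 0.0
    for k in range(n + 1):
        b = math.exp(binomial_log_pmf(k, n, p))
        q = math.exp(poisson_log_pmf(k, lam))
        total += abs(b - q)
        poisson_mass += q
    # Poisson mass above n, where the binomial distribution has none.
    total += max(0.0, 1.0 - poisson_mass)
    return 0.5 * total


if __name__ == "__main__":
    big_n, m_n = 1000, 2000                 # hypothetical N and M_N
    tv = tv_binomial_poisson(m_n, 1.0 / big_n)
    sets = int(big_n ** 0.625)              # floor(N^{5/8})
    print(f"d_TV = {tv:.3e}, bound 1/N = {1.0 / big_n:.3e}")
    print(f"union bound over {sets} sets: {sets * tv:.3e}")
```

The computed distance is well inside the bound $1/N$, so the union bound over the sampled sets of progeny is conservative in this instance.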

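We now proceed to the comparison between the age distributions $G_N$ and $G$. We
assume henceforth that $N \ge n_1$, where

$$ n_1 := \lceil 4(1+\mu)^2 \rceil, \tag{2.20} $$

so that, on $A_N$, $\zeta_N \le \tfrac12$ if $N \ge n_1$, and $M_N$ is then bounded below by a fixed positive
multiple of $N\mu$. Recall the σ-field $\mathcal{F}^*_N$ from (2.12).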
Lemma 2.2 If $N \ge n_1$, there is an event $A^*_N \in \mathcal{F}^*_N$ having $\mathbb{P}[(A^*_N)^c] = O(N^{-1/8})$ such
that, for suitably chosen $\varepsilon_N = O(N^{-1/6})$, we have

$$ \mathbb{P}\bigl[|G_N^{-1}(U) - G^{-1}(U)| > \eta_N \bigm| A^*_N\bigr] \le \eta_N, $$

where $\eta_N := \psi_N^{1/2} + 2\varepsilon_N$ and $\psi_N := 2\varepsilon_N G^{-1}(1 - \varepsilon_N)$, and where $U \sim \mathrm{U}[0,1]$. Note that
$\eta_N \to 0$ as $N \to \infty$ if $G$ has finite second moment.

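Proof: We begin by using the Dvoretzky-Kiefer-Wolfowitz inequality, in the form given
by Massart (1990), which shows that

$$ \mathbb{P}\Bigl[\sqrt{M_{Nl}}\,\sup_x|G_{Nlk}(x) - G_{lk}(x)| > z\Bigr] \le 2e^{-2z^2} $$

for any $z \ge \sqrt{\tfrac12\log 2}$ and any $k, l$. Taking $z_N := \sqrt{2\log N}$, it follows that

$$ \mathbb{P}[(A_{Nlk})^c] \le 2N^{-4} \tag{2.21} $$

for each $l, k$, where $A_{Nlk} := \{\sqrt{M_{Nl}}\,\sup_x|G_{Nlk}(x) - G_{lk}(x)| \le z_N\} \in \mathcal{F}^*_N$. Observe that,
for all $x$,

$$ |G_N(x) - G(x)| \le \sum_{l=1}^{\lfloor N^{1/3}\rfloor}\sum_{k=1}^{l}\Bigl\{\frac{M_{Nl}}{M_N}|G_{Nlk}(x) - G_{lk}(x)| + \Bigl|\frac{M_{Nl}}{M_N} - \frac{p_l}{\mu}\Bigr|G_{lk}(x)\Bigr\}
   + \frac{1}{M_N}\sum_{l>\lfloor N^{1/3}\rfloor} l M_{Nl} + \frac1\mu\sum_{l>\lfloor N^{1/3}\rfloor} l p_l. \tag{2.22} $$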

Now, by the Chernoff inequalities (Theorem 2.3 in McDiarmid (1998)), we have 

PPDI < />0, (2.23) 

where Aj^i := (IMat; — | JAf|p/| < 41ogA^(l V y/Npi)] G J"^, and, by Markov's inequality, 

P[(^ir)1 < A^'^^ Yl - ^"'^^E{z/(1)2}, (2.24) 

i>[Ari/3J 

where A% := {E^>l^V3j IMm < N'^'} G . 
Define 

{Livi/3j I \ f Livi/3j ^ 

n n^kJn n 
1=1 k=l j \^ 1=1 J 

then F[{A*j^Y] = 0{N~^/^), by LemmaEH ([Ml]), <UMj and flCTj) . On A^, from fl2:22|) . 
for all X > 0, we have 

LAri/3j 

|Gjv(a;) -^(x)! < 



V /|^^v/2b^ + 4M^MogiV{lV 



+ 



l-/ii 



N\ 



+ -— + iv-^/^EMi)^ 



12 



To justify the order of the bound, note first that, from (2.23), on $A^*_N$,

$$ M_{Nl} \le \begin{cases} 8\log N, & Np_l \le 1;\\ 8\log N\sqrt{Np_l}, & 1 < Np_l \le \{4\log N\}^2;\\ 2Np_l, & Np_l > \{4\log N\}^2, \end{cases} $$

and then J^kin'^/'^] ^ — ^"^^^ ^ > i^/^ for N >ni on An and, by Cauchy-Schwarz, 



i<[Afi/3J 



Finally, 



N\ 



M, 



N 



N 



M. 



N 



on An, for N > ni. 



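Now, since $G(x) - \varepsilon_N \le G_N(x) \le G(x) + \varepsilon_N$ for all $x > 0$, it also follows for all $y$ that
$G^{-1}(y - \varepsilon_N) \le G_N^{-1}(y) \le G^{-1}(y + \varepsilon_N)$, and thus that

$$ |G_N^{-1}(y) - G^{-1}(y)| \le G^{-1}(y + \varepsilon_N) - G^{-1}(y - \varepsilon_N). \tag{2.25} $$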
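
The passage from a uniform bound on distribution functions to a sandwich for their inverses can be illustrated on simulated data. The sketch below is a hypothetical check, not part of the proof: it takes $G$ to be the standard exponential distribution, uses the empirical distribution function of an i.i.d. sample in place of $G_N$, sets $\varepsilon$ equal to the observed Kolmogorov distance, and verifies $G^{-1}(y - \varepsilon) \le G_N^{-1}(y) \le G^{-1}(y + \varepsilon)$ on a grid of levels $y$.

```python
import math
import random


def exp_cdf(x: float) -> float:
    return 1.0 - math.exp(-x) if x > 0.0 else 0.0


def exp_quantile(y: float) -> float:
    return -math.log1p(-y)


def empirical_quantile(sorted_sample: list, y: float) -> float:
    """Generalized inverse of the empirical d.f.: smallest x with G_N(x) >= y."""
    n = len(sorted_sample)
    index = max(1, math.ceil(n * y))
    return sorted_sample[index - 1]


def sup_distance(sorted_sample: list, cdf) -> float:
    """Kolmogorov distance sup_x |G_N(x) - G(x)| for a continuous d.f. G."""
    n = len(sorted_sample)
    return max(max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
               for i, x in enumerate(sorted_sample))


def sandwich_holds(sorted_sample: list, eps: float, levels: list) -> bool:
    """Check G^{-1}(y - eps) <= G_N^{-1}(y) <= G^{-1}(y + eps) at the given levels."""
    slack = 1e-9                           # tolerance for floating point rounding
    for y in levels:
        if not eps < y < 1.0 - eps:
            continue                       # the sandwich is only used away from 0 and 1
        q = empirical_quantile(sorted_sample, y)
        if exp_quantile(y - eps) > q + slack or q > exp_quantile(y + eps) + slack:
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(7)
    sample = sorted(rng.expovariate(1.0) for _ in range(1000))
    eps = sup_distance(sample, exp_cdf)
    levels = [(j + 0.5) / 200 for j in range(200)]
    print("eps =", eps, "sandwich holds:", sandwich_holds(sample, eps, levels))
```

The grid of levels avoids integer multiples of $1/n$, so that rounding in `n * y` cannot change the order statistic that is selected.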

Hence it follows that, for any > 0, 

r ' \G-^\y)-G-\y)\dy < f \G-\y + s n) - G-\y - e n)} dy 
Jo Jo 

< / G-\y)dy < 2eNG-\l-7] + eN) = i^l- 

J l-'q-EN 

Taking 77 := 2eN, this shows that, for U uniformly distributed on [0, 1], 

n\GN\U)-G-\U)\I[U <l-2eM]\A*j,} < 

and Markov's inequality completes the proof. Note that, since G is assumed to have finite 
second moment, — G{x)) = o(l) as x — )■ cxd, implying that eNG^^{l — En) = o{e]^'^) 
as — !■ 00. ■ 
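The two ingredients of this proof, a uniform bound on the empirical distribution function and the resulting sandwich (2.25) for the quantile transformation, can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: $G$ is taken to be the Exp(1) distribution, a single i.i.d. sample of size $n$ stands in for the empirical distribution, and the helper names are purely illustrative.

```python
import math
import random

def sup_distance_to_exp(sorted_sample, rate=1.0):
    """Exact Kolmogorov distance sup_x |G_n(x) - G(x)| for G = Exp(rate)."""
    n = len(sorted_sample)
    worst = 0.0
    for i, x in enumerate(sorted_sample, start=1):
        g = 1.0 - math.exp(-rate * x)
        # the empirical distribution function jumps from (i-1)/n to i/n at x
        worst = max(worst, i / n - g, g - (i - 1) / n)
    return worst

def empirical_quantile(sorted_sample, u):
    """G_n^{-1}(u) = inf{x : G_n(x) >= u}, for 0 < u <= 1."""
    n = len(sorted_sample)
    index = min(n - 1, max(0, math.ceil(n * u) - 1))
    return sorted_sample[index]

def exp_quantile(u, rate=1.0):
    """G^{-1}(u) for G = Exp(rate), 0 <= u < 1."""
    return -math.log(1.0 - u) / rate

rng = random.Random(2013)          # fixed seed, so the run is reproducible
n = 2000
sample = sorted(rng.expovariate(1.0) for _ in range(n))

eps = sup_distance_to_exp(sample)
# DKW threshold z_n / sqrt(n) with z_n = sqrt(2 log n), as in the proof
dkw_threshold = math.sqrt(2.0 * math.log(n)) / math.sqrt(n)

# quantile coupling: the same uniform variable drives both quantile functions
us = [rng.random() for _ in range(1000)]
gaps = [abs(empirical_quantile(sample, u) - exp_quantile(u))
        for u in us if u <= 1.0 - 2.0 * eps]
mean_gap = sum(gaps) / len(gaps)

print("sup |G_n - G| =", eps, " DKW threshold =", dkw_threshold)
print("mean quantile gap away from the upper tail =", mean_gap)
```

The restriction to $u \le 1 - 2\varepsilon$ mirrors the indicator $I[U \le 1-2\varepsilon_N]$ in the proof: the quantile functions can differ substantially only near the upper tail.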



Corollary 2.3 Let $A^*_N$ be as in Lemma 2.2. If $G$ satisfies Assumption 2 with $b < \infty$, then, on $A^*_N$,

$$\sup_{0<u<1} |G_N^{-1}(u) - G^{-1}(u)| = o(1) \quad \text{as } N \to \infty;$$

if $G$ satisfies Assumption 2 with $b = \infty$, then

$$\sup_{u:\,G^{-1}(u) \le x_N} |G_N^{-1}(u) - G^{-1}(u)| = o(1) \quad \text{as } N \to \infty,$$

for $x_N$ such that $N^{-\alpha}x_N$ is bounded below as $N \to \infty$ for some $\alpha > 0$.
Proof: For the first part, let the support of $G$ be $[a, b]$. Then for any $\delta > 0$, with (2.25),

\Gjl{u) - G-\u)\ < 26+ 

where $\varepsilon_N = O(N^{-1/6})$ is as in Lemma 2.2 and $g_-[c,d] := \min_{c \le x \le d} g(x)$. So take $\delta = \delta_N \to 0$ in such a way that $\varepsilon_N = o(g_-[a+\delta_N,\, b-\delta_N])$.

For the second part, for large enough that k{xo + 1)^^ > En, define xni > xq such 
that {xni + 1)''' = k/EN, and choose any xn < Xni- Then, uniformly for all u such that 
a + 6 < G^^{u) < xn, 

G {u+En)-G (m) < — — ^— — — — T— T < 



m.m{g^[a + 6,xo],k{xN + ^} g-[a + S,xo] k{xN + i) ^ 

So choose 6n = e]^^ and xn = {k^N / EnY^'^' — 1 < xni, and 5^ such that En = o{g-[a + 
(5^,xo]); this gives 

sup \G-^\u) - G-\u)\ < —r^, - + 25n + 5'^ ^ 0, 

and xmN^^/^^'^^'> is bounded below as N ^ oo. ■ 

We also need to know that paths of a given length cannot contain too many births. 

Lemma 2.4 Suppose that $\lim_{\varepsilon \to 0} G(\varepsilon) = 0$. Then there exists $t_* > 0$ such that all individuals of generation $n$ in $Z$ are born after time $nt_*$, except on an event of probability at most $2e^{-n}$.

Proof: Let $Z_n$ denote the number of individuals of generation $n$ in $Z$, starting with $Z_0 = 1$. Then $\mathbb{E}Z_n = \mu^n$, and so $\mathbb{P}[Z_n > \{e\mu\}^n] \le e^{-n}$. Now the time elapsed up to generation $n$ along any given line is a sum of $n$ independent $G$-distributed random variables, and the probability that fewer than $n/2$ of these are greater than a given value $\varepsilon$ is the binomial probability

$$\mathrm{Bi}(n,p)\{[\lceil n/2\rceil, n]\} \le \{1 + p(z_p - 1)\}^n z_p^{-\lceil n/2\rceil} \le (4p)^{n/2}$$

with $z_p := (1-p)/p$ and $p = G(\varepsilon)$. Hence the probability that, up to generation $n$, any line takes less than time $\varepsilon n/2$ is at most

$$e^{-n} + \exp\{n(\log\mu + 1) - \tfrac{n}{2}\log(1/4G(\varepsilon))\}.$$

Taking $\varepsilon > 0$ such that $\log(1/4G(\varepsilon)) > 2(\log\mu + 2)$ makes this probability at most $2e^{-n}$, and taking $t_* := \varepsilon/2$ proves the lemma. ■
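The binomial tail estimate used in this proof is a standard Chernoff-type bound, and it can be verified directly for small $n$. The sketch below compares the exact upper tail of $\mathrm{Bi}(n,p)$ at $\lceil n/2\rceil$ with the exponential-moment bound at $z_p = (1-p)/p$ and with $(4p)^{n/2}$; the function names and the particular values of $n$ and $p$ are illustrative choices, and the bound is only claimed here for $p \le 1/2$ (so that $z_p \ge 1$).

```python
import math

def binomial_upper_tail(n, p, k):
    """P[Bin(n, p) >= k], summed exactly from the probability mass function."""
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j)
               for j in range(k, n + 1))

def chernoff_bound(n, p):
    """{1 + p(z - 1)}^n * z^(-ceil(n/2)) with z = (1 - p)/p, for p <= 1/2."""
    z = (1.0 - p) / p
    return (1.0 + p * (z - 1.0))**n * z**(-math.ceil(n / 2))

for n in (4, 10, 25):
    for p in (0.01, 0.05, 0.2):
        tail = binomial_upper_tail(n, p, math.ceil(n / 2))
        print(n, p, tail, chernoff_bound(n, p), (4.0 * p)**(n / 2))
```

With $z_p = (1-p)/p$ one has $1 + p(z_p - 1) = 2(1-p)$, so the middle expression is at most $\{4p(1-p)\}^{n/2}$, which explains the final bound.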
2.3 Controlling the ghosts

We now need to control the differences between the epidemic and branching processes; we need to show that ghosts play no significant part. We begin with the forward branching process $Z$. Recalling from (2.4) that $W(t) := B(t)e^{-\lambda t} \to W$ a.s., we write $e_W := \sup_t \mathbb{E}W(t) < \infty$. Label the individuals of $Z$ independently and uniformly from $[N]$ in order of birth epoch until time $\tau_N$; let $L(t)$ denote the number of times that a label has been used before, creating an initial ghost, and let $L^+(t) \ge L(t)$ denote the number of initial ghosts and their descendants whose birth times have been determined by time $t$. Finally, let $t^a_N := a\lambda^{-1}\log N$, $a > 0$.

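Lemma 2.5 Under the above assumptions,

$$\mathbb{P}[\{N^{-1/2}L^+(\tau_N) > N^{-1/4}\} \cap \{W(\tau_N) \ge N^{-1/8}\}] = O(N^{-1/8}\log N).$$

Proof: For any of the first $\lfloor \sqrt{N}(\mu+1) \rfloor$ indices chosen, the probability that it is a repeat of an index chosen earlier is at most $N^{-1/2}(\mu+1)$. Hence, for any $a > 0$, writing $T := t^a_N$,

$$\mathbb{E}\{L^+(\tau_N \wedge t^a_N)\} \le (\mu+1)N^{-1/2}\,\mathbb{E}\int_0^T \mu e_W e^{\lambda(T-t)}\,B(dt),$$

since an individual born at $t$ has an expected number of descendants at time $T$ of at most $e_W e^{\lambda(T-t)}$, for each of which the expected number of offspring whose births are still to come is at most $\mu$. Hence

$$\mathbb{E}\{L^+(\tau_N \wedge t^a_N)\} \le (\mu+1)N^{-1/2}\mu e_W e^{\lambda T}\,\mathbb{E}\Bigl\{B(T)e^{-\lambda T} + \lambda\int_0^T e^{-\lambda t}B(t)\,dt\Bigr\} \le (\mu+1)N^{-1/2}\mu e_W^2(1+\lambda T)e^{\lambda T}.$$

Thus, choosing $a = (1+\varepsilon)/2$, we have

$$\mathbb{P}[\{N^{-1/2}L^+(\tau_N) > N^{-1/2+\varepsilon}\} \cap \{W(\tau_N) \ge N^{-\varepsilon/2}\}] = O(N^{-\varepsilon/2}\log N),$$

since $\tau_N \le t_N^{(1+\varepsilon)/2}$ when $W(\tau_N) \ge N^{-\varepsilon/2}$, and the lemma follows by taking $\varepsilon = 1/4$. ■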
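The first step of this proof is a birthday-problem estimate: when about $\sqrt{N}$ labels are drawn uniformly with replacement from $[N]$, only a bounded number of draws repeat an earlier label. A minimal simulation sketch follows; it models only the initial ghosts $L$ (label repeats), not their descendants $L^+$, and the values of $N$, the number of trials and the function names are assumptions made for illustration.

```python
import math
import random

def expected_repeats(m, n_labels):
    """Exact mean number of draws repeating an earlier label, among m uniform draws."""
    expected_distinct = n_labels * (1.0 - (1.0 - 1.0 / n_labels)**m)
    return m - expected_distinct

def simulate_repeats(m, n_labels, rng):
    """Draw m labels with replacement and count those that were already used."""
    seen = set()
    repeats = 0
    for _ in range(m):
        label = rng.randrange(n_labels)
        if label in seen:
            repeats += 1      # this individual would be an initial ghost
        else:
            seen.add(label)
    return repeats

rng = random.Random(1)
N = 10_000
m = math.isqrt(N)             # about sqrt(N) individuals are labelled
trials = 2000
average = sum(simulate_repeats(m, N, rng) for _ in range(trials)) / trials

print("simulated mean number of initial ghosts:", average)
print("exact mean:", expected_repeats(m, N), " bound m(m-1)/(2N):", m * (m - 1) / (2 * N))
```

The bound $m(m-1)/(2N)$ comes from summing the per-draw repeat probabilities $(r-1)/N$, which is the same union-type estimate as in the proof.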

For the backward branching processes $Z_N$ and $Z$, the argument is a little different, because the identities of the individuals (even if not their labels) are implicitly recognised during the construction of the branching process $Z_N$; the choice of a particular value from $G_N$ may well determine the choice of the individual in $J_N$ that gave rise to it, and will certainly do so if the distribution $G$ is continuous. Hence, when constructing $Z_N$, an initial ghost appears when the same birth time $t_{Nj,i}$ is sampled from the same individual $j$ for the second or subsequent time, and individual $j$ is represented more than once (but without creating ghosts) if several distinct elements of $\{t_{Nj,i},\ 1 \le i \le \nu(j)\}$ are sampled. By
Lemma [2. H the branching process Z has the same offspring numbers as Z^ up to [A^^/®J 
with probability 1 — 0{N~^^^), and individuals can also be identified starting from a 
realization of Z, by using the quantile transformation to go from a value sampled from G 
to the corresponding value from Gn (with an arbitrary rule for distinguishing individuals 
that give rise to identical birth times). Thus the ghosts arise during the joint construction;
afterwards, labelling is at random without replacement from $J_N$ for the distinct individuals
in Z up to time [N^l^\ . 

As before, we note that $W(t) := B(t)e^{-\lambda t} \to W$ a.s. as $t \to \infty$. We can then write $e_W := \sup_t \mathbb{E}\{W(t) \mid B(0) = 1\} < \infty$ (if the process is started with $B(0) = 2$, as from $K$ and $K'$, the supremum is doubled). We let $L(t)$ denote the number of initial ghosts that have arisen by time $t$, $L^+(t) \ge L(t)$ the number of initial ghosts and their descendants that have arisen by then, and $L^{(2)}(t)$ the number of individuals represented at least twice by time $t$. We also denote by $L^\theta(t)$ the number of marked individuals and their descendants up to time $t$, if individuals are marked independently with probability $\theta$.

Lemma 2.6 Let $K$ and $K'$ be independently chosen at random from $J_N$, and let $\eta'_N := \eta_N\log N$, where $\eta_N = o(N^{-1/24})$ is as in Lemma 2.2. Then, conditional on $A^*_N$, and starting the branching process $Z$ either from $K$ or from both of $K$ and $K'$, we have

(1) $\mathbb{P}[N^{-1/2}L^+(t_N(u)) > N^{-3/16} \mid \mathcal{F}^+_N \cap A^*_N] = O(N^{-1/8}\log N)$;

(2) ^N~^'^L^^^\tN{u)) > N-"^'^""] = 0{e{N)N^I^^ log N), 

uniformly for all u < (log A^)/48. Furthermore, there is a set A'^ G J-"^ with FKA'^Y] = 
0(Ar-i/24) ^^^^ ^^^^ 

(3) F[N-^/^L^^\tNiu)) > I p ^/^j ^ 0(Ar-i/24), 

uniformly in the same range of u. 

Proof: The first and second statements of the lemma are proved in much the same way as Lemma 2.5. For the first, we note that the probability of the $r$-th individual born being an initial ghost is at most $(r-1)/M_N$. Hence, for any $w > 0$ and $N \ge n_1$,

E{min{L+(t^(n), f (5, we^'^^^^))} \ J^^^ n A^} 

J ev^e^^*^ B{dt) \ , 
< 2/i-iwe2«e^(l + u + ilogAr), 
where f{B,v) := inf{t: B{t) > v}. Thus, and from Lemma [2. ![ 

F[{N-'/^L+{tj,{u)) > Ar-l/2+5e/4| p j^e/2^Xt^{u)^ > | ^+ ^ 

= 0(A^"3"/V"logiV). 

Since also 

P[r(5, Ar^/'e^*^^")) < tN{u)] = F[B{tj^{u)) > Ar^/2e^*ivW] < N'^^^e^, 
it follows that, for u < l^logA^, 

F[N-^/^L+{tN{u)) > iV-l/2+5e/4 I ^ ^ 0(Ar-^/2 log A^), 
and the first statement follows by taking $\varepsilon = 1/4$.
For the second, we have 

^ f ptNiu) ^ 

< ^e"e|,(l + M + ilogiV), 

and the statement follows from Markov's inequality. 

For the third, we begin by noting that the choices of individual in $Z_N$ after $n$ have been examined are multinomially $\mathrm{MN}(n; \{\nu(j)/M_N,\ j \in J_N\})$ distributed, so that the mean number of individuals that have by then been chosen more than once is at most

Let A% := {Y,i>iMni{1/MnY < 2N^-^} E J"^, and suppose that iV > m as in (12:2(1 . 
Observe that, since Mat > ^N^ on A^, and 

we have = 0{N-^) for any e < 1/8. Then, using (|2:26|) . 

E{L^''\t)I[W{t) < N']\A%} = E{B^\t)I[B{t) < N'e^']\A%} < A^-^+^^e^^* < A^^^ 

uniformly in t < (l/2A)(l + £:) logiV. Hence, and since P[iy(t) > A^^] < e^A^"^ it follows 
that, for u < ^elogA^, 

F[N-'/''3^\tN{u)) > N^'-^/^ \ A%] = 0{N-'), 
giving the third assertion if we take e = 1/24. I 

We now use $L(G_N, t)$ to denote the number of individuals in $Z$, together with their descendants, up to time $t$, for which the sample taken from $G$ to determine their birth time is such that the difference between it and the corresponding value obtained from $G_N$ by the quantile transformation exceeds the threshold $\psi_N$ defined in Lemma 2.2. Note that, on $A^*_N$, the expected contribution to $L(G_N, t)$ resulting from the offspring of an individual born at time $v < t$ is at most $\mu\eta_N e_W e^{\lambda(t-v)}$, where $\eta_N$ is as in Lemma 2.2. The proof of Lemma 2.6(2) then yields the following corollary.

Corollary 2.7 In the setting of Lemma 2.6, with $\eta'_N = \eta_N\log N$ and $\eta_N$ as in Lemma 2.2,

we have 
2.4 Main theorem

We now combine our previous results to prove the main result of Section 2. For any $t > 0$, let $\mathcal{S}_N(t)$ denote the set of individuals in the epidemic that are still susceptible at time $t$, and write $S_N(t) := |\mathcal{S}_N(t)|$. Then, for independently and randomly chosen $K$ and $K'$ in $[N]$,

$$\mathbb{E}\{N^{-1}S_N(t) \mid \mathcal{F}^+_N\} = \frac{1}{N}\sum_{k=1}^N \mathbb{P}[k \in \mathcal{S}_N(t) \mid \mathcal{F}^+_N] = \mathbb{P}[K \in \mathcal{S}_N(t) \mid \mathcal{F}^+_N],$$

and similarly

$$\mathrm{Var}\{N^{-1}S_N(t) \mid \mathcal{F}^+_N\} = \mathbb{P}[\{K, K'\} \subset \mathcal{S}_N(t) \mid \mathcal{F}^+_N] - \{\mathbb{P}[K \in \mathcal{S}_N(t) \mid \mathcal{F}^+_N]\}^2,$$

and we use these expressions to show that $N^{-1}S_N(t)$ is close to its expectation, and to give an asymptotic expression for it.

At time $\tau_N$, the epidemic process has generated a collection $U_N$ of individuals, whose birth times, the elements of $V_N(\tau_N)$, are determined, but have not yet occurred, and which have not yet been labelled (so that some of them may turn out to be ghosts); labels are assigned to them independently and at random from $[N]$, and ghosts are then removed, leaving a labelled set $U'_N \subset U_N$.

A randomly chosen individual $K$ samples an independent copy of the reversed branching process $Z$, and uses it to determine its susceptibility process, by way of $Z_N$. For times to infection, as measured in $Z$-time, not exceeding $t_N(u+h)$, there is a corresponding susceptibility set $U_N(u+h)$, consisting of distinct individuals. The elements of the set $U_N(u+h)$ are now assigned labels, chosen independently but without replacement from $J_N$. Let $\mathcal{E}_N(u+h)$ denote the set of elements of $U_N(u+h)$ that share labels with members of $U_N$. Then $E_N(u+h) := |\mathcal{E}_N(u+h)|$ has conditional expectation $|U_N(u+h)|\,|U_N|/N$. If $E_N(u+h) = 0$, there is no path of infection from the initial individual to $K$ of $Z$-length less than $\tau_N + t_N(u+h)$. If $E_N(u+h) > 0$, go through the elements of $\mathcal{E}_N(u+h)$ in order of increasing $Z$-time, and mark all their progeny in $U_N(u+h)$ as ghosts, since these elements are also represented as members of $U_N$, and their infection pre-history has already been determined in $\mathcal{F}^+_N$. Let $\mathcal{E}'_N(u+h) \subset \mathcal{E}_N(u+h)$ denote those elements of $\mathcal{E}_N(u+h)$ that are not marked as ghosts, and write $E'_N(u+h) := |\mathcal{E}'_N(u+h)|$. For any element $e$ of $\mathcal{E}'_N(u+h)$, let $\tau_N + v$ denote the birth time of the corresponding element of $U_N$, let $\sigma$ denote the birth time in $Z$ of the element of $\mathcal{E}'_N(u+h)$, and $\sigma_N$ its corresponding birth time in $Z_N$. Then $e$ gives rise to an infection path from the initial individual to $K$ of length $\tau_N + v + \sigma_N$. If this is less than or equal to $\tau_N + t_N(u)$ for any $e$, then $K \notin \mathcal{S}_N(\tau_N + t_N(u))$; otherwise, $K \in \mathcal{S}_N(\tau_N + t_N(u))$ unless, possibly, there is an infection path with $v + \sigma_N \le t_N(u)$ but $v + \sigma > t_N(u+h)$. Using these considerations, we can deduce an approximation for $\mathbb{P}[K \in \mathcal{S}_N(\tau_N + t_N(u)) \mid \mathcal{F}^+_N]$, and a similar argument, with two reversed branching processes, leads also to a corresponding approximation to $\mathbb{P}[\{K, K'\} \subset \mathcal{S}_N(\tau_N + t_N(u))]$.

The proof of the theorem that follows is essentially concerned with quantifying the above steps. In particular, it is to be shown that $\mathcal{E}_N(u+h) = \mathcal{E}'_N(u+h)$ with high probability, and that $|U_N(u+h)|\,|U_N|/N \sim (\mu-1)We^{u+h}$. Then, for any element $e$ of $\mathcal{E}'_N(u+h)$, we need to show that the corresponding $v$ is sampled from a distribution close to $F$, as defined in (2.6), and that $t_N(u+h) - \sigma_N$ is sampled from a distribution close to the exponential distribution $\mathrm{Exp}(\lambda)$ with mean $1/\lambda$, in view of (2.16). Assuming that this is the case, it follows that

$$\mathbb{P}[v + \sigma_N \le t_N(u)] \sim e^{-h}\int_0^\infty \lambda e^{-\lambda s}F(s)\,ds = e^{-h}\,\frac{\mu}{\mu-1}\int_0^\infty \lambda s e^{-\lambda s}\,G(ds). \qquad (2.27)$$

The conditional mean number of such events is therefore asymptotically $We^u m_*$, where $m_*$ is given in (2.3), and a Poisson approximation shows that the probability of none of them occurring is close to $e^{-We^u m_*}$. The required approximation to $\mathbb{P}[K \in \mathcal{S}_N(\tau_N + t_N(u)) \mid \mathcal{F}^+_N]$ is then $\mathbb{E}\{e^{-We^u m_*}\}$. Finally, the possibility that there is an infection path with $v + \sigma_N \le t_N(u)$ but $v + \sigma > t_N(u+h)$ has to be excluded.

Theorem 2.8 Under Assumptions 1 and 2, there exists an event $A_N \in \mathcal{F}_N$ such that $\mathbb{P}[A_N^c] \to 0$ as $N \to \infty$, for which

$$\mathbb{P}\Bigl[\sup_u |N^{-1}S_N(\tau_N + \lambda^{-1}\{\tfrac12\log N + u\}) - s(u)| > \varepsilon \Bigm| \mathcal{F}^+_N \cap A_N \cap \{\tau_N < \infty\}\Bigr] \to 0$$

as $N \to \infty$, where $s$ is the decreasing function given by

$$s(u) := \mathbb{E}\{\exp(-We^u m_*)\},$$

and where $m_* = \mu\lambda\int_0^\infty se^{-\lambda s}\,G(ds)$, as in (2.3).

Remark. It therefore follows that $\sup_u |N^{-1}S_N(\lambda^{-1}\{\log N - \log W + u\}) - s(u)| \to_d 0$, conditionally on $W > 0$. However, in practice, it may be more reasonable to expect to be able to observe the time $\tau_N$ than it is to know the value of $W$, or, equivalently, when the first infection occurred.
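The constants $\lambda$ and $m_*$ entering the theorem are straightforward to compute once $\mu$ and $G$ are specified. The following sketch finds the Malthusian parameter by bisection from $\mu\int_0^\infty e^{-\lambda v}G(dv) = 1$ and evaluates $m_* = \mu\lambda\int_0^\infty se^{-\lambda s}G(ds)$ by a midpoint rule. The choice $G = \mathrm{Exp}(\gamma)$ with $\gamma = 1$ and $\mu = 3$, the truncation point of the integral and the function names are all assumptions made for illustration; in this special case the closed forms are $\lambda = \gamma(\mu-1)$ and $m_* = (\mu-1)/\mu$.

```python
import math

def malthusian_parameter(mu, laplace_transform):
    """Solve mu * LT(lam) = 1 for lam > 0 by bisection (requires mu > 1)."""
    lo, hi = 0.0, 1.0
    while mu * laplace_transform(hi) > 1.0:   # enlarge the bracket if needed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mu * laplace_transform(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def m_star(mu, lam, density, upper=60.0, steps=60_000):
    """Midpoint-rule value of mu * lam * int_0^upper v * exp(-lam v) * g(v) dv."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        total += v * math.exp(-lam * v) * density(v)
    return mu * lam * total * h

gamma_rate = 1.0   # assumed rate of the exponential distribution G
mu = 3.0           # assumed mean offspring number

lam = malthusian_parameter(mu, lambda s: gamma_rate / (gamma_rate + s))
m = m_star(mu, lam, lambda v: gamma_rate * math.exp(-gamma_rate * v))

print("lambda =", lam, " closed form:", gamma_rate * (mu - 1.0))
print("m_*    =", m, " closed form:", (mu - 1.0) / mu)
```

Evaluating $s(u)$ itself additionally requires the law of $W$, which is not attempted in this sketch.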

Proof: By Lemma 1^31 and Nerman (1981), Corollary 5.6 with 0i(t) = ^(t, oo) and (f)2{t) = 
1, and using the fact that $N^{-1/2}B(\tau_N) \to 1$ a.s. as $N \to \infty$, we obtain that $N^{-1/2}|U_N| \to (\mu-1)$ a.s. as $N \to \infty$ on $\{W > 0\}$; Lemma 2.5 shows that excluding ghosts has negligible effect on the branching asymptotics. Thus we can define a set

A% := {\N-'/'\UM\-{fi-l)\<Vi{N)} G (2.28) 
where r]i{N) and F[{A%Y] ^ as A^ ^ oo. Let 

Am := A%nA%nA%n{W{TM)>N-'/^}. 

We wish first to show that, for any $u \in \mathbb{R}$,

$$\mathbb{P}[K \in \mathcal{S}_N(\tau_N + t_N(u)) \mid \mathcal{F}^+_N \cap A_N] \sim s(u),$$

where $s$ is as stated in the theorem. To do so we proceed as outlined above. On $A_N$, we have $|U_N| \sim N^{1/2}(\mu-1)$, in view of (2.28). Then, by (2.15) and Lemma 2.1(1,3), $|U_N(u+h)| \sim N^{1/2}e^{u+h}W$; Lemma 2.6 shows that excluding ghosts and individuals multiply referenced has little effect on the branching asymptotics. The mean number of individuals in $U_N(u+h)$ that share a common index with a member of $U_N$ is thus asymptotic to $(\mu-1)We^{u+h}$.

We now show that P[^7v(?^ + h) 7^ 8'^{u + h)] = OiN^'"^/^^). Letting E^{u + h) denote the 
number of descendants of £n{u + h), it follows from Lemma [2.6( 2). by taking 6 = 6{N) = 
N-'^\Un\ and in view of (E2HD, that 

F[E§{u + h)> AT^/ie I jr+^ n An] = 0{N~^'^ logN). 

The conditional probability that any of them is marked by a label from [/tv is thus at 
most of order 0(iV-i/2+5/i6 ^ N-^l^\ogN) = 0{N-^/^^). 

Now, because of the random scheme of assignment of labels, any pair in £n{u + h) 
is associated with a random choice of elements v of VmIt^) and a of A{tN{u + h)), and 
the empirical distributions of the elements of these sets converge, as observed in (12.51) . 
fl2.6p and fl2.16l) . Furthermore, the empirical distribution F^'^'^'^ of the birth times in 
corresponding to the elements of A{ti\i{u + h)) also converges to the exponential Exp(A) 
distribution with mean 1/A if > 0. To see this, we argue as follows. Recalling fl2.16p . 
let ri2it) be such that limt^ooV^it) = and that 



P 



s>o B{t) ^ 



< si 



(1 



w >0 



(2.29) 



Then define 



k :-- 



l + e- 
2XU 



where t^, is as in Lemma [2.41 Observe that, in view of Corollary 12.71 and of Lemma 

sup \fI^\s) - (1 - e-^^)| < A^^vHogiV + //^(tivM) + N'/\r^'^y/y\A{tN{u))\ 



on {W > 0}, uniformly in -u < ^erlogA^, except on a set of conditional probability at 
most (vnY^^ + 2A^-(i+=)/2At* ^^2(tiv(M)), and that sup^ e-^*|A(t)| < 00. 

At this point, we also need to exclude the possibility that there is an infection path with 
V + ajy < t^iu) but V + a > t^iu + h). Corollary 12.71 shows that, on A*^, the probability 
of having a path from K to Un containing a sample f from G such that |f — fjv| > ^/'at, 
where ■= Gj/^{G{f)), before time A~"'^(l/2 + e)logA^ is small for e < 1/24, and the 



number of births in a path up to that time is bounded by c log in view of Lemma 12. 4[ 
with high probability. Hence there has to be at least one pair (f, f^) in the path such 
that f — tn > c', for c' = (l/2cA)£:, if m < |A~^e log N and a — > A~^£ log N — u. But 
this cannot be the case, for large enough, in view of Corollary 12.31 
Hence, on the event $A_N$, and conditional on $\mathcal{F}_N(u) := \sigma(Z(t),\ 0 \le t \le t_N(u)) \vee \mathcal{F}^+_N$, the mean number of pairs with common index, one from $U_N$ and one from $U_N(u+h)$, that are not ghosts and give rise to an infection path between the initial individual and $K$ of length at most $\tau_N + t_N(u)$, is given as in (2.27) and (2.3) by

$$m_N(u, W) \sim We^u m_*;$$

of course, the asymptotics are valid also when $W = 0$. Let $I_{Nj}(u)$ denote the indicator of the event that the label of the $j$-th element of $U_N$ is matched with one of the labels assigned to $U_N(u)$, $1 \le j \le |U_N|$. Then, conditional on $\mathcal{F}_N(u)$, $(I_{Nj}(u),\ 1 \le j \le |U_N|)$ is a collection of independent indicator random variables, each with probability $p_N(u) := |U_N|^{-1}m_N(u, W)$; hence it follows by Barbour, Holst & Janson (1992, (1.23)) that



\Un\ 

p[^/^,(w) 



exp{— mjv(M, W)} 



(2.30) 



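Thus we deduce that

$$\mathbb{P}[K \in \mathcal{S}_N(\tau_N + t_N(u)) \mid \mathcal{F}^+_N \cap A_N] \sim \mathbb{E}\{\exp(-We^u m_*)\} = s(u).$$

But this means that

$$s_N(u) := \mathbb{E}\{N^{-1}S_N(\tau_N + \lambda^{-1}\{\tfrac12\log N + u\}) \mid \mathcal{F}^+_N \cap A_N\} \sim s(u) \qquad (2.31)$$

also.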
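The Poisson approximation step can be illustrated in the simplest setting, where $n$ independent indicators each have success probability $m/n$. The sketch below compares the exact probability that none of them occurs with $e^{-m}$, and checks the discrepancy against the standard Stein-Chen total variation bound $\lambda^{-1}(1-e^{-\lambda})\sum_j p_j^2$ with $\lambda = \sum_j p_j$. Whether this is exactly the form of the bound cited from Barbour, Holst & Janson is an assumption here, as are the function names and parameter values.

```python
import math

def prob_no_match(n, m):
    """P[no success] for n independent indicators, each with probability m/n."""
    return (1.0 - m / n)**n

def stein_chen_bound(n, m):
    """(1 - e^{-m})/m * sum_j p_j^2 with p_j = m/n, i.e. (1 - e^{-m}) * m / n."""
    return (1.0 - math.exp(-m)) * m / n

for n in (50, 500, 5000):
    for m in (0.5, 2.0, 5.0):
        exact = prob_no_match(n, m)
        poisson = math.exp(-m)
        print(n, m, exact, poisson, abs(exact - poisson), stein_chen_bound(n, m))
```

With $n$ playing the role of $|U_N| \asymp \sqrt{N}$ and $m$ that of $m_N(u, W)$, the error is of order $m/n$, which is why the approximation becomes exact in the limit.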

The argument for approximating the probability that both $K$ and $K'$ belong to $\mathcal{S}_N(\tau_N + t_N(u))$ runs in much the same way. The limiting random variable for the branching process $Z$ starting with two individuals can be expressed as $W_1 + W_2$, where the two are independent copies of $W$, and the sizes of the corresponding sets $U_N^{(1)}(u)$ and $U_N^{(2)}(u)$ are asymptotically $N^{1/2}W_1e^u$ and $N^{1/2}W_2e^u$ respectively. We write $I_{Nj}(u) = (1,0)$ if the $j$-th element of $U_N$ is matched with a label associated with $U_N^{(1)}(u)$, and $(0,0)$ otherwise; similarly, $I_{Nj}(u) = (0,1)$ if matched with a label associated with $U_N^{(2)}(u)$ and $(0,0)$ otherwise. Then both $K$ and $K'$ belong to $\mathcal{S}_N(\tau_N + t_N(u))$ if $\sum_{j=1}^{|U_N|} I_{Nj}(u) = (0,0)$. The multivariate analogue of the Poisson approximation (2.30) (Roos, 1999, Theorem 1) gives

$$\Bigl|\exp\{-m_N(u, W_1) - m_N(u, W_2)\} - \mathbb{P}\Bigl[\sum_{j=1}^{|U_N|} I_{Nj}(u) = (0,0) \Bigm| \mathcal{F}^+_N \cap A_N\Bigr]\Bigr| \le c\,|U_N|^{-1}\{m_N(u, W_1) + m_N(u, W_2)\}, \qquad (2.32)$$

for a universal constant $c$. Hence, as before,

$$\mathbb{P}[\{K, K'\} \subset \mathcal{S}_N(\tau_N + t_N(u)) \mid \mathcal{F}^+_N \cap A_N] \sim \mathbb{E}\{\exp\{-m_N(u, W_1) - m_N(u, W_2)\} \mid \mathcal{F}^+_N \cap A_N\} \sim \{s(u)\}^2, \qquad (2.33)$$
by the independence of $W_1$ and $W_2$. But the joint probability can also be written as

$$\mathbb{E}\bigl\{\bigl(N^{-1}S_N(\tau_N + \lambda^{-1}\{\tfrac12\log N + u\})\bigr)^2 \bigm| \mathcal{F}^+_N \cap A_N\bigr\},$$

so that it follows from (2.31) and (2.33) that

$$\mathrm{Var}\{N^{-1}S_N(\tau_N + \lambda^{-1}\{\tfrac12\log N + u\}) \mid \mathcal{F}^+_N \cap A_N\} \to 0. \qquad (2.34)$$

It now follows, by a standard argument, that, for any $\varepsilon > 0$, conditional on $\mathcal{F}^+_N \cap A_N$,

$$\mathbb{P}\Bigl[\sup_u |N^{-1}S_N(\tau_N + \lambda^{-1}\{\tfrac12\log N + u\}) - s(u)| > \varepsilon\Bigr] \to 0$$

as $N \to \infty$, and the theorem follows. ■

Because of the factor $\lambda^{-1}$ in the definition of $t_N(u)$, the quantity $s(\lambda t)$ should match the solution $s(t)$ of (1.1). To see that this is so, note that, by considering the possibilities for the offspring of the first individual in $Z$, $\psi(\theta) := \mathbb{E}\{e^{-\theta W}\}$ satisfies the equation

$$\psi(\theta) = \exp\Bigl\{-\mu\int_0^\infty (1 - \psi(\theta e^{-\lambda w}))\,G(dw)\Bigr\}. \qquad (2.35)$$

Substituting $\theta = m_*e^{\lambda t}$, writing

$$s(t) = s(\lambda t) = \psi(m_*e^{\lambda t}) \qquad (2.36)$$

and taking logarithms recovers equation (1.2), with $\mu G(du)$ in place of $\beta(v)\,dv$. As for (1.2), equation (2.35) has many solutions, since, if $\psi(\theta)$ is a solution, so is $\psi_\alpha(\theta) := \psi(\alpha\theta)$, for any fixed $\alpha > 0$. The condition $\psi(0) = 1$, equivalent to $s(-\infty) = 1$, is satisfied by all $\psi_\alpha$. The relevant choice of solution to (2.35) is determined by matching $\mathbb{E}W$ with $-\psi'(0)$, or, in terms of (1.2), with $(m_*\lambda)^{-1}\lim_{t \to -\infty} e^{-\lambda t}(-Ds(t))$. A renewal equation for $\mathbb{E}\{B(t)e^{-\lambda t}\}$ gives the solution as

$$\mathbb{E}W = \lim_{t \to \infty}\mathbb{E}\{B(t)e^{-\lambda t}\} = \Bigl\{\lambda\mu\int_0^\infty ve^{-\lambda v}\,G(dv)\Bigr\}^{-1} = \frac{1}{m_*},$$

by the key renewal theorem. Thus Theorem 2.8 can be interpreted as a formal justification of the stochastic basis for the Kermack-McKendrick epidemic as described in Metz (1978), Section 4, under assumptions that are slightly more general, in that the point processes $\xi$ are not required to be doubly stochastic, but are in some respects more restrictive as regards the choice of $\beta$. Since $\psi$ is identified as the Laplace transform of a probability distribution, it is an analytic function in $\Re(\theta) > 0$, which, with (2.36), proves Conjecture (f) in Metz (1978), p. 120.
3 Refinements

3.1 Multitype epidemics

Very similar arguments can be carried through for epidemics in populations consisting of individuals of more than one type. Suppose that there are a finite number $d$ of different types, with $N_l$ individuals of type $l$, $1 \le l \le d$, where $N_l \in \{\lfloor N\pi_l \rfloor, \lceil N\pi_l \rceil\}$, $\sum_{l=1}^d N_l = N$ and $\sum_{l=1}^d \pi_l = 1$. Assume that type $l$ individuals have independent and identically distributed point processes $\xi^{(l)}_i$, $1 \le i \le N_l$, on $[d] \times \mathbb{R}_+$, with mean measures

$$\mathbb{E}\{\xi^{(l)}_i(k, du)\} = \mu_{lk}\,G_{lk}(du), \qquad 1 \le k \le d. \qquad (3.1)$$

Then an epidemic process can be constructed in the population, just as in the single type case, by beginning with a multitype branching process constructed from independent realizations of the $\xi^{(l)}$, $1 \le l \le d$, and then using random labelling within the members of each type to determine which transitions are to be retained in the epidemic process. The approximation arguments are very much as before. Asymptotically exponential growth and the analogues of (2.5) and (2.16), together with an asymptotically stable type distribution, hold in $L_1$ in the multitype setting. The asymptotic statements that we use in this section are all justified by Theorem 7.3 of Jagers (1989), who proves $L_1$ approximation for a wide variety of characteristics of the branching process in an even more general setting.

Remark. It is perhaps more natural, especially when comparing the spread of the same epidemic in populations with different compositions of types, to assume a fixed value for the measures $\alpha_{lk}(du) := \mu_{lk}G_{lk}(du)/\pi_k$, rather than supposing that $\mu_{lk}G_{lk}$ remains the same for all $N$. The quantity $\alpha_{lk}(du)$ can be interpreted as representing the infection intensity measure of contacts with type $k$ individuals made by a type $l$ individual, in a population consisting entirely of individuals of type $k$. At least in Poisson process contact models, this would suggest taking $\mathbb{E}\{\xi^{(l)}_i(k, du)\} = \alpha_{lk}(du)N_k/N$ in a population of the composition given above, implying that $G_{lk}(du) = \alpha_{lk}(du)/\alpha_{lk}(\mathbb{R}_+)$ is fixed for all $N$, but that $\mu_{lk} = \alpha_{lk}(\mathbb{R}_+)N_k/N$ may vary with $N$. This differs from (3.1) inasmuch as $N_k/N$ is not exactly equal to $\pi_k$. As in the single-type model, this minor difference entails no change in the theorems that we prove.

We now assume that the matrix $\mu$ is irreducible, and that the distribution functions $G_{lk}$ all satisfy Assumption 2; suppose also that the largest eigenvalue of $\mu$ is larger than 1, and write

$$\mu_{lk}(s) := \mu_{lk}\int_0^\infty e^{-su}\,G_{lk}(du).$$

Then the branching process has as Malthusian parameter the value $\lambda > 0$ for which $\mu(\lambda)$ has largest eigenvalue 1. We write $\zeta^T$ and $\eta$ for the positive left and right eigenvectors of $\mu(\lambda)$ associated with eigenvalue 1, normalized such that $\zeta^T\mathbf{1} = \zeta^T\eta = 1$. Let $B^T(t) := (B_l(t),\ 1 \le l \le d)$ denote the numbers of individuals of each type born up to time $t$. Then, if the branching process starts from a single individual of type $i$,

$$B^T(t)e^{-\lambda t} \to W^{(i)}\zeta^T \quad \text{in } L_1 \qquad (3.2)$$

as $t \to \infty$. Here, $W^{(i)}$ is a random variable whose Laplace transform $\psi^{(i)}(s) := \mathbb{E}\{e^{-sW^{(i)}}\}$ satisfies the implicit equations

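$$\psi^{(l)}(s) = \mathbb{E}\Bigl\{\exp\Bigl(\sum_{k=1}^d \int_0^\infty \log\psi^{(k)}(se^{-\lambda v})\,\xi^{(l)}(k, dv)\Bigr)\Bigr\}, \qquad 1 \le l \le d, \qquad (3.3)$$

with $\mathbb{E}W^{(l)} = \eta_l/m^{(1)}$ and

$$m^{(1)} := \lambda\zeta^T(-D\mu(\lambda))\eta; \qquad (3.4)$$

note that

$$(-D\mu(\lambda))_{lk} = \mu_{lk}\int_0^\infty ue^{-\lambda u}\,G_{lk}(du),$$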

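and that $m^{(1)}/\lambda$ is the multitype mean age at child bearing (Jagers (1989), p. 195). Letting $V_l(t)$ denote the set of times until birth of the unborn type $l$ offspring of individuals born before $t$, it follows also that

$$e^{-\lambda t}|V_l(t)| \to W^{(i)}c_l \quad \text{in } L_1, \qquad (3.5)$$

with

$$c_l := \sum_{k=1}^d \zeta_k\mu_{kl}\int_0^\infty (1 - e^{-\lambda v})\,G_{kl}(dv) = \sum_{k=1}^d \zeta_k(\mu_{kl} - \mu_{kl}(\lambda)) = \sum_{k=1}^d \zeta_k\mu_{kl} - \zeta_l, \qquad (3.6)$$

and that, on $W^{(i)} > 0$,

$$\sup_s \Bigl|\,|V_l(t) \cap (s, \infty)|/|V_l(t)| - (1 - F_l(s))\Bigr| \to 0, \qquad (3.7)$$

where

$$1 - F_l(s) := c_l^{-1}\sum_{k=1}^d \zeta_k\mu_{kl}\int_s^\infty (1 - e^{-\lambda(v-s)})\,G_{kl}(dv), \qquad (3.8)$$

replacing (2.5) and (2.6).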

The backward branching process is similar, but has Poisson point processes $\hat\xi^{(l)}$ with intensity $\mu_{kl}G_{kl}(du)$ at $(k, u) \in [d] \times \mathbb{R}_+$. The matrix $\hat\mu(s)$ is given by $\mu(s)^T$, so that the Malthusian parameter is still $\lambda$, but the left and right eigenvectors at $\lambda$ are swapped; the normalized versions are $\hat\zeta^T := \eta^T/H$ and $\hat\eta := H\zeta$, where $H := \sum_{k=1}^d \eta_k$. The backward random variables $\hat W^{(l)} := \lim_{t \to \infty} e^{-\lambda t}\sum_{k=1}^d \hat B_k(t)$ corresponding to the initial conditions $1 \le l \le d$ now have means $\hat\eta_l/m^{(1)} = H\zeta_l/m^{(1)}$, and their Laplace transforms $\hat\psi^{(l)}$ satisfy the equations

$$\hat\psi^{(l)}(s) = \exp\Bigl\{-\sum_{k=1}^d \mu_{kl}\int_0^\infty (1 - \hat\psi^{(k)}(se^{-\lambda v}))\,G_{kl}(dv)\Bigr\}, \qquad 1 \le l \le d. \qquad (3.9)$$

As in (2.16), the empirical distribution of the ages at time $t$ of $l$-individuals born before $t$ also converges in $L_1$ to $\mathrm{Exp}(\lambda)$.
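The multitype quantities $\lambda$, $\zeta$, $\eta$ and $m^{(1)}$ can be computed numerically once $\mu$ and the $G_{lk}$ are fixed. The following is a minimal sketch, assuming (purely for illustration) that every $G_{lk}$ is exponential with rate `rate[l][k]`, so that $\mu_{lk}(s) = \mu_{lk}\,r_{lk}/(r_{lk}+s)$ and $(-D\mu(\lambda))_{lk} = \mu_{lk}\,r_{lk}/(r_{lk}+\lambda)^2$. The Perron root is found by plain power iteration, which presumes a primitive matrix; all names are hypothetical.

```python
def perron(matrix, iterations=500):
    """Power iteration: returns (largest eigenvalue, right eigenvector summing to 1)."""
    d = len(matrix)
    v = [1.0 / d] * d
    root = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        root = sum(w)              # v sums to 1, so this estimates the eigenvalue
        v = [x / root for x in w]
    return root, v

def transpose(matrix):
    return [list(row) for row in zip(*matrix)]

def mu_of_s(mu, rate, s):
    """Entries mu_lk * int e^{-su} G_lk(du) for exponential G_lk (an assumption)."""
    d = len(mu)
    return [[mu[l][k] * rate[l][k] / (rate[l][k] + s) for k in range(d)]
            for l in range(d)]

def malthusian(mu, rate):
    """Bisection for the lam > 0 at which mu(lam) has Perron root 1."""
    lo, hi = 0.0, 1.0
    while perron(mu_of_s(mu, rate, hi))[0] > 1.0:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if perron(mu_of_s(mu, rate, mid))[0] > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def eigen_data(mu, rate):
    """Return lam, zeta (sums to 1), eta (zeta^T eta = 1) and m1 as in (3.4)."""
    d = len(mu)
    lam = malthusian(mu, rate)
    at_lam = mu_of_s(mu, rate, lam)
    _, eta = perron(at_lam)                  # right eigenvector
    _, zeta = perron(transpose(at_lam))      # left eigenvector, sums to 1
    scale = sum(z * e for z, e in zip(zeta, eta))
    eta = [e / scale for e in eta]           # enforce zeta^T eta = 1
    minus_d = [[mu[l][k] * rate[l][k] / (rate[l][k] + lam)**2 for k in range(d)]
               for l in range(d)]
    m1 = lam * sum(zeta[l] * minus_d[l][k] * eta[k]
                   for l in range(d) for k in range(d))
    return lam, zeta, eta, m1

mu = [[1.0, 2.0], [2.0, 1.0]]
rate = [[1.0, 1.0], [1.0, 1.0]]
lam, zeta, eta, m1 = eigen_data(mu, rate)
print("lambda =", lam, " zeta =", zeta, " eta =", eta, " m1 =", m1)
```

In this symmetric example both types behave like a single type with mean offspring number 3 and $\mathrm{Exp}(1)$ birth times, so the output can be checked by hand against the single-type formulas; the backward quantities then follow from $H = \sum_k \eta_k$, $\hat\zeta = \eta/H$ and $\hat\eta = H\zeta$.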

Now suppose that the forward branching process starts with a single type $i$ individual. Define $\tau_N := \inf\{t > 0\colon \sum_{l=1}^d B_l(t) \ge \lfloor\sqrt{N}\rfloor\}$, so that $W^{(i)}e^{\lambda\tau_N} \sim \sqrt{N}$ as $N \to \infty$, from (3.2), and $|V_l(\tau_N)| \sim c_l\sqrt{N}$, $1 \le l \le d$, from (3.5). Then run the backward branching process starting with a single type $i'$ individual; at time $t_N(u) := \lambda^{-1}(\tfrac12\log N + u)$, as in (2.7), we have $\hat B(t_N(u)) \sim \sqrt{N}\,\hat W^{(i')}e^u\hat\zeta$. Hence the mean number of pairs consisting of one element $v$ of $V_l(\tau_N)$ and one type $l$ individual $w$ born before $t_N(u)$ in the backward branching process, such that $v$ is less than the age of $w$ at $t_N(u)$, is asymptotically given by

$$\{c_l\sqrt{N}\}\{\sqrt{N}\,\hat W^{(i')}e^u\hat\zeta_l\}\int_0^\infty \lambda e^{-\lambda s}F_l(s)\,ds = N\hat W^{(i')}e^u\hat\zeta_l\int_0^\infty \lambda ve^{-\lambda v}\sum_{k=1}^d \zeta_k\mu_{kl}\,G_{kl}(dv).$$

Thus, when the individuals corresponding to the $V_l(\tau_N)$ and the type $l$ individuals in the backward branching process are randomly labelled in constructing the epidemic process, the mean number of such pairs that have the same labels is asymptotically given by

$$\pi_l^{-1}\hat W^{(i')}e^u\hat\zeta_l\int_0^\infty \lambda ve^{-\lambda v}\sum_{k=1}^d \zeta_k\mu_{kl}\,G_{kl}(dv),$$

and hence the probability that there is no such pair of any type $l$, $1 \le l \le d$, is asymptotically given by $\exp\{-\hat W^{(i')}e^u m^{(2)}\}$, where

$$m^{(2)} := \int_0^\infty \lambda ve^{-\lambda v}\sum_{k=1}^d\sum_{l=1}^d \pi_l^{-1}\hat\zeta_l\zeta_k\mu_{kl}\,G_{kl}(dv). \qquad (3.10)$$

Arguing as in the case of a single type, we have the following theorem, in which $\mathcal{F}^+_N$ denotes the precise analogue of the $\sigma$-algebra having the same name in the single type case, and $S_{Nl}(t)$ is the number of type $l$ susceptibles at time $t$.

Theorem 3.1 Suppose that the multitype forward branching process is supercritical and has offspring distributions with finite second moments; suppose also that Assumption 2 holds for each $G_{lk}$. Then there exists an event $A_N \in \mathcal{F}_N$ such that $\mathbb{P}[A_N^c] \to 0$ as $N \to \infty$, for which

$$\mathbb{P}\Bigl[\sup_u |(N\pi_l)^{-1}S_{Nl}(\tau_N + \lambda^{-1}\{\tfrac12\log N + u\}) - s_l(u)| > \varepsilon \Bigm| \mathcal{F}^+_N \cap A_N \cap \{\tau_N < \infty\}\Bigr] \to 0$$

as $N \to \infty$, where $s_l$ is the decreasing function given by

$$s_l(u) := \hat\psi^{(l)}(e^u m^{(2)}),$$

where the $\hat\psi^{(l)}$ satisfy (3.9) with $-D\hat\psi^{(l)}(0) = H\zeta_l/m^{(1)}$, and where $m^{(1)}$ is defined in (3.4) and $m^{(2)}$ in (3.10).
3.2 A configuration model

In this section, we consider a different model of epidemic spread. In those considered so 
far, an infected individual chooses to infect a number of randomly chosen individuals, and 
the individuals chosen are not taken into account in this choice. Now we suppose that pairs 
of individuals are either acquainted with one another or are not, so that acquaintanceship 
determines a graph on the set of individuals, and we assume that infectious contacts can 
only be made between graph neighbours. This yields a more symmetric description of the 
contact process, and, as a result, the forward and backward branching approximations 
can be expected to look more similar. We shall, for simplicity, assume that there is a 
finite, $N$-independent upper bound $K$ on the number of acquaintances that an individual
may have; note that this immediately rules out any Poisson distribution of offspring in an 
approximating branching process, so that the backward branching processes from such a 
model have to be different from those in the previous sections. 

To make further progress, we assume that the acquaintanceship graph is nonetheless rather randomly constituted within the population, according to the following construction. We assume that $N_k$ members of the population are 'type $k$' individuals, who have exactly $k$ acquaintances, with $\sum_{k=1}^K N_k = N$ and $N_k \in \{\lfloor N\pi_k \rfloor, \lceil N\pi_k \rceil\}$, for fixed $\pi_1, \ldots, \pi_K$, and with $M := \sum_{k=1}^K kN_k$ even. Think of a type $k$ individual as having $k$ half-edges, and join the half-edges into edges by means of a random matching of the $M$ half-edges, determining the acquaintanceship graph. This graph may have some loops and multiple edges, but they are few, and we shall ignore their effects. Thus the method of assigning which individuals are acquaintances remains essentially random, but the propensities of each individual are respected when determining whether they are acquainted or not. We then assume that an infected type $k$ individual makes contact with a given type $l$ acquaintance at a random time after infection that has (possibly defective) distribution function $G_{kl}$ and is independent of all other contact times; we suppose also that a type $k$ individual remains infectious for a random time with (possibly defective) distribution $\Phi_k$, again independently of everything else. If we specialize to the case where the distributions $G_{kl}$ are all identical and equal to $\mathrm{Exp}(\alpha)$, and the $\Phi_k$ are all identical and equal to $\mathrm{Exp}(\beta)$, then the model of Volz (2008) (in the case of a finite number $K$ of possible contact numbers) is recovered.
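The random matching of half-edges described above is easy to sketch: list each vertex once per half-edge, shuffle the list, and pair consecutive entries. The code below also counts loops and surplus multiple edges, to illustrate that they are few. The degree sequence, the seed and the function names are illustrative assumptions; contact times and infectious periods are not modelled here.

```python
import random
from collections import Counter

def random_matching(degrees, rng):
    """Pair half-edges uniformly at random; degrees[v] = number of half-edges of v."""
    half_edges = [v for v, k in enumerate(degrees) for _ in range(k)]
    if len(half_edges) % 2:
        raise ValueError("the total number of half-edges M must be even")
    rng.shuffle(half_edges)
    # consecutive entries of a uniformly shuffled list form a uniform matching
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]

def count_defects(edges):
    """Return (number of loops, number of surplus copies of repeated edges)."""
    loops = sum(1 for a, b in edges if a == b)
    multiplicity = Counter(tuple(sorted(e)) for e in edges if e[0] != e[1])
    surplus = sum(c - 1 for c in multiplicity.values())
    return loops, surplus

rng = random.Random(7)
degrees = [1] * 400 + [2] * 400 + [3] * 200    # N = 1000, K = 3, M = 1800
edges = random_matching(degrees, rng)
loops, surplus = count_defects(edges)
print("edges:", len(edges), " loops:", loops, " surplus multiple edges:", surplus)
```

For bounded degrees the expected numbers of loops and multiple edges stay bounded as $N$ grows, which is the sense in which their effect can be ignored.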

As in the previous models, the key effort lies in determining the probability that an 
initially chosen individual infects another randomly chosen individual before a specified 
time t. To do so, construct the association graph by starting from the initial individual as 
root vertex, and matching its half-edges by random choice from the set of all half-edges; 
then attach the infectious period to the initial individual, and the lengths of time to 
potentially infectious contact to the edges. This yields a set of infected vertices, together 
with the times of their infection, some of which may be infinite. Now continue by matching 
the remaining half-edges associated with the first of these vertices (if any) to be infected, 
attaching the infectious period to the chosen vertex, and adding the lengths of time to 
potentially infectious contact (infinite, if longer than the infectious period) for each edge 
to the time of infection of the chosen vertex, so as to yield the times of infection of newly 
infected vertices; this augments the set of infected vertices. Proceed in this way, always 
choosing for development the infected vertex with unmatched half-edges that has the 
smallest time of infection, until the first time that either at least $\lfloor\sqrt{N}\rfloor$ vertices have
been infected or the infection dies out. In the former case, there remains a set of infected 
vertices whose subsequent contact history has not been explored. If a half-edge is picked 
for a second or subsequent time, ignore the choice and re-sample until a new one is chosen; 
if a vertex is chosen that has already been infected, ignore it for future development. As 
in the previous arguments, for the lengths of time in which we are interested, there are a 
few such repeated samples, but few enough that they can be ignored. 

For the susceptibility graph seen backwards from a randomly chosen individual, carry 
out essentially the same procedure for a specified time; the only difference is the vertex 
to which the infectious period is attached, being that of the child, rather than the parent. 
Half-edges that have previously been used, including those that were used in the forward 
process, are discarded and re-sampled; the half-edges that are associated with the set of 
infected but unexplored vertices from the forward phase are still available for choice, and 
are those that close chains of infection. 

If repeats are ignored, the infection process as seen from the initial individual becomes 
a branching process with $K$ types. In the branching process, a type $k$ individual (other
than the initial individual) has $k - 1$ offspring, corresponding to the $k - 1$ half-edges
that remain to be connected after a type $k$ individual has been encountered in growing
the association graph, and each of these is of type $l$ with probability $lp_l/m$, where
$m := \sum_{l'=1}^{K} l' p_{l'}$, chosen from the size-biased transform of the frequency distribution
$(p_{l'},\ 1 \le l' \le K)$. As before, the difference between the process with this distribution and that
with offspring probabilities $lN_l/M$ is negligible for our purposes. The type $k$ individual
also has an infectious period randomly assigned to it from the distribution $\Phi_k$, and the
times to contact along the different edges are assigned independently from the appropriate
distributions $G_{kl}$. This yields an age-dependent multi-type branching process, in which
times to birth may be infinite (if the sampled time to contact is itself infinite, or exceeds 
the infectious period of the parent), and the times of birth of the descendants of a given 
individual are dependent, because they are finite only if they do not exceed the infectious 
period of the common parent. 
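As a small illustration of the offspring types, the following sketch computes the size-biased transform $lp_l/m$ of a degree distribution and the expected numbers of offspring of each type for a non-initial type $k$ individual. The function names and the example distribution are hypothetical, chosen only for illustration.

```python
def size_biased(p):
    """Size-biased transform {l: l*p_l/m} of a degree distribution p = {l: p_l}."""
    m = sum(l * pl for l, pl in p.items())   # mean degree m = sum_l l*p_l
    return {l: l * pl / m for l, pl in p.items()}


def mean_offspring(p, k):
    """Expected number of type-l offspring of a non-initial type-k individual,
    counting all k - 1 remaining half-edges (whether or not contact is ever made)."""
    return {l: (k - 1) * ql for l, ql in size_biased(p).items()}


# Hypothetical degree distribution: p_1 = 0.5, p_2 = 0.3, p_3 = 0.2.
p = {1: 0.5, 2: 0.3, 3: 0.2}
q = size_biased(p)
```

Here `q` sums to 1, and `mean_offspring(p, 3)` sums to 2, the number of half-edges left after a type 3 individual has been reached.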

Seen from the randomly chosen individual, the backward branching process is very 
much the same. The offspring distribution is identical, but the infection times of the 
offspring of a given individual, although having the same marginal distributions as be- 
fore, are now independent, because the relevant infectious period, determining whether a 
contact results in infection, is that of the child, and not of the parent. Because the basis 
of the construction is the fixed set of half-edges, the problems that arose in Section 2.2
because the offspring distribution of the backward branching process was not fixed for
all $N$ no longer appear (except for the trivial differences between $lp_l/m$ and $lN_l/M$);
more importantly, choosing the contact times for type $k$ to type $l$ contacts independently
from $G_{kl}$ and the infectious periods independently from the $\Phi_k$ means that the times to
birth in the backward branching process have distributions that do not depend on $N$, so
that there is no need for an analogue of Corollary 2.3, and hence no special assumptions
about the tails of the $G_{kl}$ need to be made. Of course, the offspring distributions of the
different types are bounded, so that the corresponding moment conditions are automatically
satisfied.

The argument now proceeds much as for the multitype process of the previous section.
Once again, the asymptotic statements for the branching processes are justified by
Jagers (1989), Theorem 7.3. The matrix $\mu$ is defined analogously by

\[ \mu_{lk}(s) := (l-1)\{kp_k/m\}\int_0^\infty e^{-su}\{1 - \Phi_l(u)\}\,G_{lk}(du) =: (l-1)\{kp_k/m\}\,U_{lk}(s), \]

say, and we write $\mu_{lk} := \mu_{lk}(0)$; note that $\mu_{lk}$ need no longer be the expected number of offspring,
since $U_{lk}(0)$ is typically less than 1. Because of the factor $(l-1)$, $\mu(s)$ is reducible.
Supposing that all the $p_k$ and all the $U_{lk}(s)$ are positive, we can write the irreducible
non-negative matrix $\mu^{(1)}(s)$, obtained from $\mu(s)$ by removing the first row and column,
as $D_1 U^{(1)}(s) D_2$, where $D_1 := \mathrm{diag}(1, 2, \ldots, K-1)$ and $D_2 := m^{-1}\mathrm{diag}(2p_2, \ldots, Kp_K)$.
Assume that the matrix $\mu^{(1)}(0)$ has dominant eigenvalue larger than 1, and define the
Malthusian parameter $\lambda$ to be such that $\mu^{(1)}(\lambda)$ has dominant eigenvalue equal to 1; let
$\zeta^{(1)}$ and $\eta^{(1)}$ be associated left and right eigenvectors. Then the left and right eigenvectors
of $\mu(\lambda)$ with eigenvalue 1 are given by $\zeta^T := Z^{-1}\bigl((\zeta^{(1)})^T\mu(\lambda)\varepsilon^{(1)},\ (\zeta^{(1)})^T\bigr)$ and
$\eta := H^{-1}(0, (\eta^{(1)})^T)^T$, where $Z$ and $H$ are chosen so that $\zeta^T\mathbf{1} = \zeta^T\eta = 1$; here, $\varepsilon^{(1)}$
denotes the first coordinate vector.
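The definition of the Malthusian parameter can be illustrated numerically. The sketch below assumes, purely for illustration, that $\Phi_l$ is Exp($\beta_l$) and $G_{lk}$ is Exp($\alpha_{lk}$), so that $U_{lk}(s) = \alpha_{lk}/(s + \alpha_{lk} + \beta_l)$; it finds the dominant eigenvalue of $\mu^{(1)}(s)$ by power iteration and then solves for $\lambda$ by bisection. The function names, argument layout and example values are hypothetical.

```python
def dominant_eigenvalue(a, iters=200):
    """Perron root of a square matrix with positive entries (power iteration, 1-norm)."""
    n = len(a)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(y)                  # growth factor, because sum(x) == 1
        x = [yi / lam for yi in y]
    return lam


def malthusian(p, alpha, beta, tol=1e-12):
    """Solve 'dominant eigenvalue of mu^(1)(s) equals 1' for s by bisection.
    p: {l: p_l}; alpha: {(l, k): rate of G_lk}; beta: {l: rate of Phi_l}.
    Assumes exponential distributions, so U_lk(s) = alpha_lk / (s + alpha_lk + beta_l)."""
    types = sorted(l for l, pl in p.items() if l >= 2 and pl > 0)
    m = sum(l * pl for l, pl in p.items())

    def rho(s):
        mu1 = [[(l - 1) * (k * p[k] / m) * alpha[l, k] / (s + alpha[l, k] + beta[l])
                for k in types] for l in types]
        return dominant_eigenvalue(mu1)

    if rho(0.0) <= 1.0:
        raise ValueError("mu^(1)(0) has dominant eigenvalue <= 1: not supercritical")
    lo, hi = 0.0, 1.0
    while rho(hi) > 1.0:              # rho is decreasing in s; bracket the root
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rho(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When all rates are equal, $\alpha_{lk} = \alpha$ and $\beta_l = \beta$, the matrix has rank one and the result can be compared with the value obtained by solving $\alpha m_{(2)}/\{m(\lambda + \alpha + \beta)\} = 1$ directly, where $m_{(2)} = \sum_k k(k-1)p_k$.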

Let $B(t) := (B_l(t),\ 1 \le l \le d)$ denote the numbers of individuals of each type born
up to time $t$; then

\[ B(t)e^{-\lambda t} \to W^{(i)}\zeta \quad \text{in } L_1 \tag{3.11} \]

as $t \to \infty$, if the initial individual has type $i$. The distribution of $W^{(i)}$ is not quite the
one that would be expected when starting the branching process with a typical type $i$
individual, because the initial type $i$ individual has $i$ offspring, instead of $i - 1$. However,
it can easily be deduced from the Laplace transforms $(\psi^{(l)}(s),\ 1 \le l \le K)$ of the limiting
random variables for the branching process that has all individuals, including the initial
one, obeying the same rules. These solve a system of implicit equations that can be
deduced from (3.3). Here, the quantity within the expectation in (3.3) can be written as

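\[ \prod_{r=1}^{l-1}\bigl\{\psi^{(K_r)}(se^{-\lambda V_r})\bigr\}^{I[V_r \le T]}, \]

where $T$ denotes the infectious period of the type $l$ individual, and $K_r$ denotes the type
and $V_r$ the contact time of the $r$-th of his $(l-1)$ acquaintances. $T$ and $(K_r,\ 1 \le r \le l-1)$
are independent, and, given $K_r = k$, $V_r$ is drawn independently of everything else from
the distribution $G_{lk}$. Thus (3.3) reduces here to the system

\[ \psi^{(l)}(s) = \int_{[0,\infty)}\Bigl[m^{-1}\sum_{k=1}^{K} kp_k\Bigl\{\int_{(0,t]}\psi^{(k)}(se^{-\lambda v})\,G_{lk}(dv) + [1 - G_{lk}(t)]\Bigr\}\Bigr]^{l-1}\Phi_l(dt), \tag{3.12} \]

for $1 \le l \le d$, with $(-D\psi^{(l)})(0) = \eta_l/m^{(1)}$ and

\[ m^{(1)} := \lambda\,\zeta^T(-D\mu)(\lambda)\,\eta; \tag{3.13} \]

the Laplace transform of the distribution of $W^{(l)}$ is then given by

\[ E\bigl\{e^{-sW^{(l)}}\bigr\} = \int_{[0,\infty)}\Bigl[m^{-1}\sum_{k=1}^{K} kp_k\Bigl\{\int_{(0,t]}\psi^{(k)}(se^{-\lambda v})\,G_{lk}(dv) + [1 - G_{lk}(t)]\Bigr\}\Bigr]^{l}\Phi_l(dt), \tag{3.14} \]

for $1 \le l \le d$. Here, we have

\[ (-D\mu)(\lambda)_{lk} = (l-1)\{kp_k/m\}\int_0^\infty ue^{-\lambda u}(1 - \Phi_l(u))\,G_{lk}(du). \]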

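Letting $V_l(t)$ denote the set of times until birth of the unborn type $l$ offspring of individuals
born before $t$, it follows also that, if the initial individual is of type $i$, then

\[ e^{-\lambda t}|V_l(t)| \to W^{(i)}c_l \quad \text{in } L_1, \tag{3.15} \]

with

\[ c_l := \sum_{k=1}^{K}\zeta_k(k-1)\{lp_l/m\}\int_0^\infty (1 - e^{-\lambda v})(1 - \Phi_k(v))\,G_{kl}(dv) = \sum_{k=1}^{K}\zeta_k\mu_{kl}(0) - \zeta_l. \tag{3.16} \]

Furthermore, on $\{W^{(i)} > 0\}$, as for (3.7) and (3.8),

\[ \sup_{s > 0}\Bigl|\,|V_l(t)\cap(s,\infty)|/|V_l(t)| - (1 - F_l(s))\Bigr| \to 0, \tag{3.17} \]

where

\[ 1 - F_l(s) := c_l^{-1}\sum_{k=1}^{K}\zeta_k(k-1)\{lp_l/m\}\int_{(s,\infty)}\bigl(1 - e^{-\lambda(v-s)}\bigr)(1 - \Phi_k(v))\,G_{kl}(dv). \tag{3.18} \]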

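The backward branching process is similar; we now have

\[ \tilde\mu_{lk}(s) := (l-1)\{kp_k/m\}\,U_{kl}(s), \]

once again reducible, with $\tilde\mu^{(1)}(s) = D_1\{U^{(1)}(s)\}^T D_2$ irreducible. It can be checked that
the Malthusian parameter is still $\lambda$. The matrix $\tilde\mu^{(1)}(\lambda)$ has left and right eigenvectors
$(\tilde\zeta^{(1)})^T = (\eta^{(1)})^T D_1^{-1}D_2$ and $\tilde\eta^{(1)} = D_2^{-1}D_1\zeta^{(1)}$ with eigenvalue 1, and the corresponding left
and right eigenvectors of $\tilde\mu(\lambda)$ are given by $\tilde\zeta^T = \tilde Z^{-1}\bigl((\eta^{(1)})^T D_2U^T(\lambda)D_2\varepsilon^{(1)},\ (\tilde\zeta^{(1)})^T\bigr)$ and
$\tilde\eta = \tilde H^{-1}(0, (\tilde\eta^{(1)})^T)^T$, where $\tilde Z$ and $\tilde H$ are chosen to make $\tilde\zeta^T\mathbf{1} = \tilde\zeta^T\tilde\eta = 1$; in particular, it
follows that $\tilde Z\tilde H = ZH$, and that the value of $\tilde m^{(1)}$ deduced from (3.13) for the backward
process is the same as $m^{(1)}$. The limiting random variable $\tilde W^{(i)}$ for the backward process
starting with a single individual of type $i$, satisfying

\[ \tilde B(t)e^{-\lambda t} \to \tilde W^{(i)}\tilde\zeta, \tag{3.19} \]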



once again has a distribution whose Laplace transform $\tilde\psi_i$ can be found from the solutions
to a set of implicit equations belonging to the backward branching process whose
individuals, including the initial individual, all follow the same rules. This branching
process has offspring that behave independently of one another as regards both type and
time of birth, so that, denoting the Laplace transforms of the limit random variables with
the different initial conditions by $(\tilde\psi^{(l)},\ 1 \le l \le K)$, we have $\tilde\psi^{(l)}(s) = \{\tilde\psi_0^{(l)}(s)\}^{l-1}$, where
the $\tilde\psi_0^{(l)}$ satisfy the equations

\[ \begin{aligned} \tilde\psi_0^{(l)}(s) &= m^{-1}\sum_{k=1}^{K} kp_k\Bigl\{\int_{(0,\infty)}\{\tilde\psi_0^{(k)}(se^{-\lambda v})\}^{k-1}(1 - \Phi_k(v))\,G_{kl}(dv) + (1 - U_{kl}(0))\Bigr\} \\ &= 1 - m^{-1}\sum_{k=1}^{K} kp_k\int_{(0,\infty)}\bigl(1 - \{\tilde\psi_0^{(k)}(se^{-\lambda v})\}^{k-1}\bigr)(1 - \Phi_k(v))\,G_{kl}(dv), \end{aligned} \tag{3.20} \]

for $1 \le l \le d$. Since $(-D\tilde\psi^{(l)})(0) = \tilde\eta_l/m^{(1)}$, the side condition for solving (3.20) is
$(-D\tilde\psi_0^{(l)})(0) = \tilde\eta_l/\{(l-1)m^{(1)}\}$, $l \ge 2$, with $\tilde\psi^{(1)}(s) = 1$ for all $s$.

The Laplace transform $\tilde\psi_i$ of $\tilde W^{(i)}$ is then given by $\{\tilde\psi_0^{(i)}\}^{i}$. As in (2.16), the empirical
distribution of the ages at time $t$ of $l$-individuals born before $t$ also converges to Exp($\lambda$).

Now suppose that the forward branching process starts with a single type $i$ individual
(having $i$ offspring). Define $\tau_N := \inf\{t > 0\colon \sum_{l=1}^{K} B_l(t) \ge \lfloor\sqrt{N}\rfloor\}$, so that $W^{(i)}e^{\lambda\tau_N} \sim \sqrt{N}$
as $N \to \infty$, from (3.11), and $|V_l(\tau_N)| \sim c_l\sqrt{N}$, $1 \le l \le K$, from (3.15). Then
run the backward branching process starting with a single type $i'$ individual; at time
$t_N(u) := \tfrac12\lambda^{-1}\log N + u$, we have $\tilde B(t_N(u)) \sim \sqrt{N}\,\tilde W^{(i')}e^{\lambda u}\tilde\zeta$. Hence the mean number
of pairs of individuals consisting of an element $v$ of $V_l(\tau_N)$ and a type $l$ individual $w$ born
before $t_N(u)$ in the backward branching process, such that $v$ is less than the age of $w$
at $t_N(u)$, is asymptotically given by

\[ \{c_l\sqrt{N}\}\{\sqrt{N}\,\tilde W^{(i')}e^{\lambda u}\tilde\zeta_l\}\int_0^\infty \lambda e^{-\lambda s}F_l(s)\,ds = N\tilde W^{(i')}e^{\lambda u}\tilde\zeta_l\int_0^\infty \lambda ve^{-\lambda v}\sum_{k=1}^{K}\zeta_k(k-1)\{lp_l/m\}(1 - \Phi_k(v))\,G_{kl}(dv). \]

Any such pair is realized as identical individuals in the epidemic process with asymptotic
probability $(l-1)/(Nlp_l)$, since the element $v$ has only $(l-1)$ half-edges available to
be matched, out of a total number of half-edges from type $l$ individuals that is still
asymptotically $Nlp_l$. Thus the mean number of such pairs that correspond to actual
matches is asymptotically given by

\[ \frac{l-1}{lp_l}\,\tilde W^{(i')}e^{\lambda u}\tilde\zeta_l\int_0^\infty \lambda ve^{-\lambda v}\sum_{k=1}^{K}\zeta_k(k-1)\{lp_l/m\}(1 - \Phi_k(v))\,G_{kl}(dv), \]

and hence the probability that there is no such pair of any type $l$, $1 \le l \le d$, is asymptotically
given by $\exp\{-\tilde W^{(i')}e^{\lambda u}m^{(2)}\}$, where

\[ m^{(2)} := \frac{1}{m}\sum_{k=1}^{K}\sum_{l=1}^{K}\zeta_k(k-1)(l-1)\tilde\zeta_l\int_0^\infty \lambda ve^{-\lambda v}(1 - \Phi_k(v))\,G_{kl}(dv). \tag{3.21} \]

These assertions, and the analogous assertions about the probability of two randomly
chosen individuals being infected by the initial individual, can be proved by the methods
introduced in Section 2, and lead to the following theorem. Here, $\mathcal{F}_N$ denotes the
$\sigma$-algebra associated with the (forward) infection process until $\lfloor\sqrt{N}\rfloor$ infections have
occurred, and $S_{Nl}(t)$ denotes the number of type $l$ susceptibles at time $t$.

Theorem 3.2 Suppose that the forward branching process is supercritical. Then there
exists an event $A_N \in \mathcal{F}_N$ such that $P[A_N] \to 1$ as $N \to \infty$, for which

\[ P\Bigl[\Bigl\{\sup_u \bigl|(Np_l)^{-1}S_{Nl}(\tau_N + \tfrac12\lambda^{-1}\log N + u) - s_l(u)\bigr| > \varepsilon\Bigr\} \cap A_N \cap \{\tau_N < \infty\}\Bigr] \to 0 \]

as $N \to \infty$, where $s_l$ is the decreasing function given by

\[ s_l(u) := \tilde\psi_l(e^{\lambda u}m^{(2)}), \]

where $\tilde\psi_l$ and $m^{(2)}$ are as defined above. In particular, the total proportion of susceptibles
$N^{-1}\sum_{l=1}^{K} S_{Nl}(\tau_N + \tfrac12\lambda^{-1}\log N + u)$ is well approximated by $\sum_{l=1}^{K} p_l s_l(u)$, uniformly
in $u$.

The general formulation above simplifies, if the distributions $\Phi_k$ of infectious period
and $G_{kl}$ of contact times are the same for all choices of the indices. In this case, the
matrix $\mu(s)$ is given by

\[ \mu_{lk}(s) := (l-1)\{kp_k/m\}\int_0^\infty e^{-su}(1 - \Phi(u))\,G(du) =: U(s)(l-1)\{kp_k/m\}, \]

and is of rank one. The positive eigenvalue is $U(s)m_{(2)}/m$, where $m_{(2)} := \sum_{k=1}^{K} k(k-1)p_k$;
the process is supercritical if $m_{(2)}/m > 1/U(0)$, where $U(0) = \int_{(0,\infty)}(1 - \Phi(u))\,G(du)$, and
$\lambda$ is such that $U(\lambda) = m/m_{(2)}$. The eigenvectors for the forward and backward processes
are equal, with $\zeta_i = \tilde\zeta_i = ip_i/m$ and $\eta_i = \tilde\eta_i = (i-1)m/m_{(2)}$. The quantities $m^{(1)}$ and $m^{(2)}$
become

\[ m^{(1)} = m_0 m_{(2)}/m \quad \text{and} \quad m^{(2)} = m_0 m_{(2)}^2/m^3, \]

where $m_0 := \int_0^\infty \lambda ve^{-\lambda v}(1 - \Phi(v))\,G(dv)$. The equations (3.20) can be much more neatly
expressed, because the functions $\tilde\psi_0^{(l)}$ are now the same for all $l$, reflecting that the backward
process of half-edges is equivalent to a single-type branching process. They reduce
to the single equation

\[ \begin{aligned} \tilde\psi_0(s) &= m^{-1}\sum_{k=1}^{K} kp_k\int_{(0,\infty)}\{\tilde\psi_0(se^{-\lambda v})\}^{k-1}(1 - \Phi(v))\,G(dv) + (1 - U(0)) \\ &= 1 - \int_{(0,\infty)}\bigl\{1 - m^{-1}g'(\tilde\psi_0(se^{-\lambda v}))\bigr\}(1 - \Phi(v))\,G(dv), \end{aligned} \tag{3.22} \]

where $g(s) := \sum_{k=1}^{K} p_k s^k$, and the initial condition is $(-D\tilde\psi_0)(0) = \tilde\eta_l/\{(l-1)m^{(1)}\} =
m/\{m_{(2)}m^{(1)}\}$. To express $s_l(u)$ more concisely, we write $h_s(u) := \tilde\psi_0(se^{\lambda u})$; then (3.22)
implies that $h := h_s$ satisfies the equation

\[ h(u) = 1 - \int_{(0,\infty)}\bigl\{1 - m^{-1}g'(h(u-v))\bigr\}(1 - \Phi(v))\,G(dv), \tag{3.23} \]

with initial condition $\lim_{u\to-\infty}\{e^{-\lambda u}(-Dh)(u)\} = \lambda s\,m/\{m_{(2)}m^{(1)}\}$; so $s_1(u) = h_{m^{(2)}}(u)$ satisfies (3.23)
with $\lim_{u\to-\infty}\{e^{-\lambda u}(-Ds_1)(u)\} = \lambda m^{-1}$, and $s_l = (s_1)^l$.

In the case of the Volz (2008) model, there is further simplification, because of the
explicit forms $\Phi(v) = 1 - e^{-\beta v}$ and $G(dv) = \alpha e^{-\alpha v}\,dv$. In this setting, a deterministic law
of large numbers starting with an asymptotically positive initial proportion of infectious
individuals was established by Decreusefond et al. (2012). With such an initial condition,
the randomness inherent in the initial stages of development, reflected in the presence
of $\tau_N$ in the statement of Theorem 3.2, plays no significant part. In the Volz setting,
explicit formulae for $\lambda = \alpha m_{(2)}/m - (\alpha + \beta)$ and $m_0 = \lambda\alpha(\lambda + \alpha + \beta)^{-2}$ can be written down,
and equation (3.23) can be expressed as

\[ h(u) = 1 - \int_0^\infty \bigl\{1 - m^{-1}g'(h(u-v))\bigr\}\alpha e^{-(\alpha+\beta)v}\,dv = \frac{\beta}{\alpha+\beta} + \frac{\alpha}{m}\int_{-\infty}^{u} g'(h(w))\,e^{-(\alpha+\beta)(u-w)}\,dw. \]

Differentiation with respect to $u$ then yields the following autonomous differential equation
for $h = h(t)$:

\[ \frac{dh}{dt} = \frac{\alpha}{m}g'(h) - (\alpha+\beta)h + \beta = (\alpha+\beta)(f(h) - h), \tag{3.24} \]

where $f(s)$ is the probability generating function $\{\alpha m^{-1}g'(s) + \beta\}/(\alpha + \beta)$. In particular, it
follows that $h(\infty)$ is the solution $q$ smaller than 1 to the equation $f(s) = s$, and hence that
the asymptotic final proportion of susceptible individuals at the end of a large outbreak
is given by $g(q)$.
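The deterministic curve in the Volz setting is easy to compute numerically. The following sketch integrates $dh/dt = (\alpha+\beta)(f(h) - h)$, with $f(s) = \{\alpha g'(s)/m + \beta\}/(\alpha+\beta)$, by a classical Runge-Kutta scheme, starting from a value of $h$ just below 1, and finds $q$ by bisection. The function name `volz_curve`, the starting value `h0`, the step size and the example degree distribution are assumptions made for illustration only; the growth rate is computed from the linearization of the equation at $h = 1$.

```python
def volz_curve(p, alpha, beta, h0=1.0 - 1e-6, dt=0.01, t_max=60.0):
    """Integrate dh/dt = (alpha+beta)(f(h) - h) by RK4 and solve f(q) = q, q < 1."""
    m = sum(k * pk for k, pk in p.items())               # mean degree
    m2 = sum(k * (k - 1) * pk for k, pk in p.items())    # m_(2)

    def g(s):        # degree probability generating function
        return sum(pk * s ** k for k, pk in p.items())

    def dg(s):       # g'(s)
        return sum(k * pk * s ** (k - 1) for k, pk in p.items())

    def f(s):
        return (alpha * dg(s) / m + beta) / (alpha + beta)

    def rhs(h):
        return (alpha + beta) * (f(h) - h)

    lam = alpha * m2 / m - alpha - beta   # growth rate of 1 - h near h = 1
    if lam <= 0:
        raise ValueError("subcritical: no large outbreak")

    h, path = h0, [(0.0, h0)]
    for n in range(1, int(round(t_max / dt)) + 1):
        k1 = rhs(h)
        k2 = rhs(h + 0.5 * dt * k1)
        k3 = rhs(h + 0.5 * dt * k2)
        k4 = rhs(h + dt * k3)
        h += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        path.append((n * dt, h))

    lo, hi = 0.0, 1.0 - 1e-6    # f(lo) >= lo; f(hi) < hi unless q is extremely close to 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > mid:
            lo = mid
        else:
            hi = mid
    q = 0.5 * (lo + hi)
    return {"lambda": lam, "q": q, "final_susceptible": g(q), "path": path}


# Hypothetical example: p_1 = 0.2, p_2 = 0.3, p_3 = 0.5, alpha = 2, beta = 0.5.
result = volz_curve({1: 0.2, 2: 0.3, 3: 0.5}, alpha=2.0, beta=0.5)
```

In this example $f$ is quadratic, so $q$ can also be found by hand as the second root of $f(s) = s$; the total proportion of susceptibles along the curve is $g(h(t))$, evaluated from `result["path"]`.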

Remark. Volz (2008) expresses the equations for the development of the epidemic as the
solutions to a system of three coupled differential equations for the variables $h$, $p_I$ and $p_S$:

\[ \frac{dh}{dt} = -\alpha p_I h; \qquad \frac{dp_S}{dt} = \alpha p_S p_I\Bigl(1 - \frac{hg''(h)}{g'(h)}\Bigr); \qquad \frac{dp_I}{dt} = \alpha p_S p_I\frac{hg''(h)}{g'(h)} - \alpha p_I(1 - p_I) - \beta p_I. \]

It is not difficult to see that their solution is given in terms of the solution $h$ to (3.24)
by $p_I = 1 - p_S + \frac{\beta}{\alpha}\{1 - \frac{1}{h}\}$ and $p_S = g'(h)/\{mh\}$. The first equation is clearly satisfied, and
the second follows by differentiating the formula for $p_S$ and using the first equation to
re-express $dh/dt$. Then the sum of the second and third equations is satisfied by differentiating
the expression for $p_S + p_I$, and then using it once more to re-express $(1 - 1/h)$.
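The verification in this remark is purely algebraic, and can be checked numerically at any value of $h$ in $(0, 1)$. The sketch below assumes the formulas $p_S = g'(h)/(mh)$ and $p_I = 1 - p_S + (\beta/\alpha)(1 - 1/h)$, takes $dh/dt = (\alpha/m)g'(h) - (\alpha+\beta)h + \beta$, and assumes the three Volz equations in the form $dh/dt = -\alpha p_I h$, $dp_S/dt = \alpha p_S p_I(1 - hg''(h)/g'(h))$ and $dp_I/dt = \alpha p_S p_I hg''(h)/g'(h) - \alpha p_I(1 - p_I) - \beta p_I$. It returns the residuals of the three equations; the function name `volz_residuals` and the example values are hypothetical.

```python
def volz_residuals(p, alpha, beta, h):
    """Residuals of the three Volz equations when p_S and p_I are expressed through h."""
    m = sum(k * pk for k, pk in p.items())
    dg = sum(k * pk * h ** (k - 1) for k, pk in p.items())                  # g'(h)
    d2g = sum(k * (k - 1) * pk * h ** (k - 2) for k, pk in p.items() if k >= 2)  # g''(h)

    dh = alpha * dg / m - (alpha + beta) * h + beta      # dh/dt along the curve
    p_s = dg / (m * h)
    p_i = 1.0 - p_s + (beta / alpha) * (1.0 - 1.0 / h)

    # Time derivatives of p_S and p_I by the chain rule.
    dp_s = (d2g / (m * h) - dg / (m * h * h)) * dh
    dp_i = -dp_s + (beta / alpha) * dh / (h * h)

    ratio = h * d2g / dg
    res1 = dh + alpha * p_i * h
    res2 = dp_s - alpha * p_s * p_i * (1.0 - ratio)
    res3 = dp_i - (alpha * p_s * p_i * ratio - alpha * p_i * (1.0 - p_i) - beta * p_i)
    return res1, res2, res3
```

All three residuals vanish up to rounding error, for any degree distribution and any $h$ for which $g'(h) > 0$.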



Remark. Although the asymptotics carried out in this section are not applicable to that
case, a reasonably general Kermack-McKendrick epidemic also fits into this epidemic
model, by taking $p_{N-1} = 1$ and by replacing $G(du)$ by $(N-1)^{-1}G(du)$; in the notation of
Section 1, we would have $\beta(u)\,du$ replacing $(1 - \Phi(u))\,G(du)$. This leads formally to equations
determining the development of the epidemic which are asymptotically equivalent,
for large $N$, to those established in Section 2. For instance, (3.23) becomes

\[ h(u) = 1 - (N-1)^{-1}\int_0^\infty \bigl\{1 - h^{N-2}(u-v)\bigr\}\beta(v)\,dv, \]

so that, writing $s(u)$ for $h(u)^{N-1}$, we obtain

\[ s(u) \sim \exp\Bigl\{-\int_0^\infty (1 - s(u-v))\,\beta(v)\,dv\Bigr\}, \]

which is just (1.2).